Trajectory-Based Anticipation of Hospital Crises via Decision-Oriented Neuro-Symbolic AI
ABSTRACT. Hospital crises rarely emerge as isolated events; they evolve as progressive trajectories of fragilization in which pressure accumulates, instability increases, and decision windows gradually close. Yet most operational AI remains event-centric, optimizing forecasts or alarms rather than supporting the sequential, constrained decisions that govern real crisis response. This paper proposes a decision-oriented neuro-symbolic approach to crisis anticipation, where the goal is not merely to predict failure but to preserve controllable time through governance-ready outputs: an interpretable operational state, a near-horizon escalation outlook, and an auditable Evidence-to-State-to-Trajectory-to-Action justification chain.
We instantiate the framework on 169 weeks of NHS weekly A&E sitreps published by NHS England, leveraging routinely reported indicators (demand, 4-hour performance, and severe delay markers) to extract weak-signal trajectory primitives capturing sustained strain and loss of stability. A symbolic layer then maps these primitives into a compact four-state fragilization timeline (LOW/STRAIN/UNSTABLE/CRITICAL) with explicit semantics and constrained escalation logic, ensuring stable and defensible transitions rather than noisy alerting. In time-respecting evaluation, the neuro component achieves approximately 94% discrimination (AUC ≈ 0.94) for next-week performance breach, while the neuro-symbolic fusion converts this predictive power into actionable, explainable-by-design escalation states suitable for operational governance.
The contribution is a deployable anticipation layer that reframes crisis AI around trajectories and decision windows: not “will a crisis happen,” but how resilience is eroding and what can still be changed before escalation becomes inevitable.
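The constrained escalation logic described above can be pictured with a minimal sketch. The four state names come from the abstract; the one-step-per-period transition constraint is an illustrative assumption, not necessarily the authors' exact rule:

```python
# Hypothetical sketch of a constrained four-state fragilization timeline.
# Assumption: the operational state may move at most one level per week,
# which yields stable transitions rather than noisy alerting.
STATES = ["LOW", "STRAIN", "UNSTABLE", "CRITICAL"]

def constrain(prev: str, proposed: str) -> str:
    """Clamp a proposed transition to one step from the previous state."""
    p, q = STATES.index(prev), STATES.index(proposed)
    step = max(-1, min(1, q - p))
    return STATES[p + step]

def timeline(raw_states):
    """Apply the escalation constraint along a weekly sequence of raw states."""
    out = [raw_states[0]]
    for s in raw_states[1:]:
        out.append(constrain(out[-1], s))
    return out
```

Under this constraint, an abrupt raw jump to CRITICAL surfaces as a gradual escalation, preserving the decision window the framework aims to protect.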
BugIQ: A Neurosymbolic Graph Neural Network for Inductive Antimicrobial Resistance
ABSTRACT. Antimicrobial resistance poses a fundamental threat to global health, necessitating rapid identification of resistance potential in novel gene sequences. Current machine learning approaches often rely on transductive learning, performing well on known genes but failing to generalize to unseen sequences. We present BugIQ, a Neurosymbolic Graph Neural Network that integrates pre-trained protein language models (ESM-2) with a structured knowledge graph of drug chemistry and gene ontology. By enforcing a symbolic consistency loss within the embedding space, BugIQ learns generalizable biological rules rather than memorizing graph topology, isolating genuine biological signals from dataset biases. Under simulated clinical sequencing noise (sigma=0.5), BugIQ retains a 72.66% AUC, significantly outperforming traditional gradient boosting baselines that suffer catastrophic collapse (-48.5%). Our results show that symbolic grounding acts as a critical regularizer, enabling robust deployment in noisy clinical environments while providing white-box interpretability for uncatalogued resistance mechanisms.
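The symbolic consistency loss can be pictured, in spirit, as a penalty pulling together the embeddings of genes that a knowledge-graph rule links (e.g. a shared resistance class). The NumPy sketch below is an illustrative assumption, not BugIQ's actual objective:

```python
import numpy as np

def consistency_loss(emb, same_class_pairs):
    """Hypothetical symbolic consistency penalty: mean squared embedding
    distance over gene pairs linked by a knowledge-graph rule, so that the
    encoder learns rule-consistent geometry rather than graph topology."""
    loss = 0.0
    for i, j in same_class_pairs:
        loss += np.sum((emb[i] - emb[j]) ** 2)
    return loss / max(len(same_class_pairs), 1)
```

Added to the task loss, a term of this shape acts as the regularizer the abstract describes, discouraging embeddings that violate the symbolic constraints.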
Support Vector Quantum Kernels and Classical Machine-Learning Models in Predicting Drug-Induced Liver Injury
ABSTRACT. Drug-induced liver injury (DILI) significantly contributes to drug failures, prompting computational predictive modeling efforts. Classical machine-learning (ML) approaches, including support vector machines (SVM), achieve strong performance but have limited interpretability and struggle with complex interactions. This study compares a support vector quantum kernel (SVQK) model with classical ML models (SVM, logistic regression, k-nearest neighbors, ensembles) on a benchmark FDA dataset of 475 drugs for DILI prediction. Quantum-enhanced SVMs, implemented via quantum kernel estimation on IBM quantum hardware, achieved comparable or slightly superior performance (AUC-ROC ∼0.85–0.90) versus the best classical baseline (AUC-ROC = 0.82 ± 0.04) under the same 16-feature constraint. Quantum kernels demonstrated robustness to noise and maintained consistent performance with fewer features. However, interpretability remains challenging, and quantum models face translational hurdles due to hardware limitations and regulatory acceptance. While quantum-enhanced models exhibit promising potential to improve DILI prediction, realizing a significant quantum advantage requires further advancements in quantum hardware and algorithms.
Human-Centric Cognitive Pattern Analysis and Smart Recommendation System for Early Alzheimer’s Risk Prediction
ABSTRACT. Alzheimer’s disease is among the most common, progressive neurodegenerative diseases that disrupt memory, thought, and other daily functions. Conventional diagnostic approaches can be expensive, time-consuming, and inaccessible. This article presents a human-centered, machine-learning and cognitive-pattern analysis application integrated with personalized recommendations for early risk assessment of Alzheimer’s disease. The data-driven application utilizes machine-learning tree ensemble models, including Random Forest, XGBoost, and Gradient Boosting, in a Stacked Ensemble architecture. The user-engaged application provides gamified cognitive assessments, such as a cognitive memory challenge and a reaction-time test that reflect cognitive agility. The analysis considers clinical, demographic, and lifestyle data to determine prediction accuracy of at-risk levels categorized as Low, Moderate, and High. We utilized a large-scale Alzheimer's dataset containing almost 100,000 samples and 39 attributes in our study. The performance results indicated that the proposed ensemble method obtained an accuracy of 99.03% in predicting risk. Overall, the web-based platform provides an easy-to-use dashboard for patients and caregivers, offering data-driven, accessible, pre-emptive cognitive health care options.
Automated Quantification of Hand Resting Tremor in Parkinson’s Disease from Routine Clinical Video
ABSTRACT. Hand resting tremor is a hallmark manifestation of Parkinson’s disease (PD). Its clinical assessment predominantly depends on visual inspection and rating scales, which are inherently subjective and qualitative. This study introduces an automated and explainable framework for the quantitative evaluation of hand resting tremor using routine clinical videos captured in unconstrained settings. The proposed pipeline combines automatic hand detection, dense optical flow for motion analysis, and principal component analysis to generate a concise motion representation. Nine quantitative descriptors spanning spectral, temporal, and image-based domains were extracted to characterize tremor dynamics. Binary classifiers were trained to distinguish tremor from non-tremor cases through nested, subject-independent cross-validation. Nonlinear ensemble models, particularly Extra Trees, achieved balanced accuracies of up to 0.78 and AUC scores of up to 0.76, demonstrating robust subject-independent discrimination. Statistical analysis complemented by SHAP-based explainability identified dominant tremor frequency, spectral concentration within the 4–6 Hz band, tremor-band amplitude, motion periodicity, and inter-limb asymmetry as the most discriminative features, consistent with established clinical characteristics of Parkinsonian resting tremor. Spatial motion descriptors contributed comparatively less to classification performance. This framework offers a fully vision-based, low-cost, and interpretable method for objective tremor assessment, facilitating the development of scalable digital biomarkers for PD evaluation.
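Two of the spectral descriptors named above (dominant tremor frequency and spectral concentration in the 4–6 Hz band) can be sketched for a 1-D motion trace. The function below is an illustrative reconstruction under simple FFT assumptions, not the authors' exact feature code:

```python
import numpy as np

def tremor_features(signal, fs):
    """Return (dominant frequency in Hz, fraction of spectral power in the
    4-6 Hz Parkinsonian tremor band) for a 1-D motion trace sampled at fs."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    power = np.abs(np.fft.rfft(signal - signal.mean())) ** 2
    dominant = freqs[np.argmax(power)]
    band = (freqs >= 4.0) & (freqs <= 6.0)
    return dominant, power[band].sum() / power.sum()
```

For a clean 5 Hz oscillation, essentially all power falls in the tremor band, matching the clinical expectation the abstract cites.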
Modality-Embedded Set Transformer Pooling for Multimodal Prostate Cancer Survival Prediction
ABSTRACT. Biochemical recurrence (BCR) after curative-intent therapy for prostate cancer is assessed from heterogeneous evidence spanning histopathology, multiparametric MRI (mpMRI), and structured clinical variables. Learning robust multimodal predictors remains challenging due to small cohort sizes, modality-specific noise, and missingness. In this work, we study intermediate fusion for BCR risk prediction on the MICCAI CHIMERA benchmark under a controlled setup. We train attention-based multiple instance learning (ABMIL) aggregators to obtain patient-level embeddings for whole-slide histology patches and for sequence-wise MRI-CORE slice embeddings (ADC/HBV/T2w), and then compare two fusion operators with matched modality projections, modality-type embeddings, and an identical prediction head: (i) a modality-token Transformer encoder with CLS readout (MEFT) and (ii) seeded cross-attention pooling with learned fusion tokens (MEST). All models are trained with the Cox objective and evaluated using a fixed 5-fold cross-validation with out-of-fold embedding generation to avoid leakage. Both fusion approaches achieve comparable performance (C-index ≈ 0.85), with MEST showing slightly lower fold-to-fold variance. Ablations indicate that histology and clinical variables dominate performance, while mpMRI provides complementary gains when fused despite weaker monomodal performance. These results suggest that, on CHIMERA, the fusion operator has a modest effect under matched embeddings, and that preserving sequence-wise mpMRI representations can improve multimodal risk stratification.
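The Cox objective used to train both fusion heads corresponds to the negative partial log-likelihood; below is a minimal Breslow-style NumPy version, a generic sketch rather than the authors' implementation:

```python
import numpy as np

def cox_neg_log_likelihood(risk, time, event):
    """Breslow negative partial log-likelihood: after sorting by descending
    event time, the risk set of each subject is a prefix of the array, so the
    denominator is a running cumulative sum of exp(risk)."""
    order = np.argsort(-time)
    risk, event = risk[order], event[order]
    log_risk_set = np.log(np.cumsum(np.exp(risk)))
    return -np.sum((risk - log_risk_set) * event) / max(event.sum(), 1.0)
```

With two uncensored subjects of equal predicted risk, the loss reduces to log(2)/2, a convenient sanity check for any implementation.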
OHANA: Optimizing Heterogeneous Multi-Artifact Correction in Neuroimaging Analysis
ABSTRACT. Magnetic Resonance Imaging (MRI) is highly susceptible to acquisition artifacts, which degrade image quality and compromise medical diagnosis. In clinical settings, the absence of robust recovery methods forces technologists to repeat corrupted scans, unnecessarily increasing operational costs and patient discomfort. Although Deep Learning methods have shown potential for correcting MRI artifacts, their development is hindered by the limited availability of artifact-corrupted images. When available, data remain dispersed across hospitals and cannot be centralized due to privacy and regulatory constraints. Compounding this, scanner heterogeneity requires models to be trained on multi-institutional data to achieve effective correction.
To address these challenges, we propose OHANA, an end-to-end solution for synthetic artifact generation and correction in multi-contrast brain MRI. We first introduce a synthetic data generator that simulates multiple artifact types, creating artifact-corrupted training datasets. Using this augmented data, we extend MC2-Net to perform artifact correction across four MRI contrasts. OHANA also integrates Federated Learning, allowing multiple institutions to collaboratively train models without sharing data.
Experiments demonstrate that OHANA outperforms state-of-the-art artifact-correction approaches, achieving a 17.2% improvement in the Structural Similarity Index Measure. A radiologist assessed the realism of the generated artifacts and the diagnostic fidelity of the corrected images. These results highlight the potential of OHANA to improve medical diagnosis.
Automatic detection of EEG electrodes on T1-weighted MR Images
ABSTRACT. Electroencephalography (EEG) and functional Magnetic Resonance Imaging (fMRI) are two major functional brain imaging modalities. EEG and fMRI can be recorded simultaneously to measure brain activity and take advantage of both modalities, providing good temporal resolution from EEG signals and high spatial resolution from fMRI. Indeed, the spatial resolution of EEG signals is poor due to the ill-posed inverse problem of source localisation.
Previous work has enabled EEG electrode detection using Ultra-Short TE MR images.
Building upon a previously introduced method, we adapt and validate it for EEG electrode localization on T1-weighted MRI, thereby extending its applicability to a new imaging modality. By relying solely on T1-weighted images, which are commonly acquired in fMRI protocols, this approach is both simple and easily applicable to existing EEG-fMRI datasets.
Although detections are slightly less accurate than those obtained on ultra-short TE sequences, the results remain excellent, with an average detection accuracy of 99.27%, an average positioning error of 2.59 mm, and perfect accuracy in electrode labeling.
A Multiscale Framework for Functional Connectivity in Schizophrenia: Familial Validation and Phenotypic Stratification
ABSTRACT. Current neuroimaging studies face methodological limitations in neural-phenotypic integration for biomarker validation. We present a multiscale computational framework integrating graph-theoretic connectivity analysis with dual-threshold statistical validation and unsupervised phenotypic stratification. Our approach combines Welch's t-test, FDR correction, and effect size thresholds to prioritize biological relevance over statistical artifacts. Applied to task-based fMRI from 86 participants across the schizophrenia spectrum, we extracted nodal and network-level metrics from sparse weighted functional graphs. Familial validation through a four-group design dissociated state-dependent from trait-related markers, establishing cerebellar-cortical hypoconnectivity as illness-specific rather than genetic risk. Consensus clustering of cognitive and symptomatic features identified three stable phenotypic subtypes with distinct neuropsychological profiles, highlighting the potential utility of phenotype stratification. This framework demonstrates that integrating multiscale graph analysis with dual-threshold validation and unsupervised stratification establishes a reproducible computational standard for biomarker discovery, significantly enhancing biological interpretability beyond traditional neurological analysis.
Multi-Scale Fractal Analysis and Bidirectional Temporal Graph Networks for Alzheimer’s and Frontotemporal Dementia Detection Using Electroencephalography
ABSTRACT. Alzheimer’s disease (AD) and frontotemporal dementia (FTD) are two of the most prevalent neurodegenerative disorders, yet their overlapping clinical presentations pose substantial diagnostic challenges. Current diagnostic methods rely on costly neuroimaging and subjective clinical assessments, creating an urgent need for accessible, objective screening tools. This paper presents a novel approach using a Bidirectional Temporal Graph Convolutional Network (GCN)-Transformer framework that incorporates comprehensive fractal pattern analysis to automatically classify AD, FTD, and cognitively normal (CN) individuals from EEG signals. Fourteen features were extracted from each EEG channel to build a complete picture: 3 time-domain features (mean, variance, standard deviation), 4 fractal dimension measures (Higuchi, Petrosian, and Katz fractal dimensions, and Detrended Fluctuation Analysis), and 7 frequency-domain features (spectral entropy, band powers across the delta, theta, alpha, beta, and gamma bands, and peak frequency). The aim of these multi-scale features is to effectively capture the self-similar patterns and nonlinear dynamics characteristic of neurodegenerative brain activity across all 19 EEG channels. The architecture combines graph convolutional layers with graph-constrained multi-head attention mechanisms operating on dynamic temporal adjacency matrices, with bidirectional processing strengthened by positional encoding. Ten-fold cross-validation of the dataset yielded a training accuracy of 98.65±0.38% and a test accuracy of 97.00±0.45%, while leave-one-subject-out validation achieved 85.00% accuracy along with 86.23% precision, 85.00% recall, 85.34% F1-score, 85.00% sensitivity, and 92.50% specificity. Our ablation studies demonstrated clear advantages over standard Graph Convolutional Network (94.12±0.78%) and Graph Attention Transformer variants (95.45±0.68%). The proposed framework achieves competitive performance while utilizing only 800k parameters, enabling efficient deployment in resource-constrained clinical settings for early-stage dementia screening.
Multi-point Nonlinear Crosstalk Correction for Fluorescence Microscopy
ABSTRACT. We present a novel multi-point calibration and correction framework for compensating nonlinear channel bleed-through (crosstalk) in epifluorescence microscopy, enabling improved quantification in liquid biopsy assays. Singly labeled control slides spanning a broad fluorescence dynamic range are used to sample intensity-dependent crosstalk at multiple levels. For an imaging system with M detection channels, each fluorophore’s bleed-through behavior is quantified using N calibration 1×M vectors corresponding to discrete intensity states. The resulting N^M precomputed correction matrices collectively model nonlinear crosstalk interactions. In contrast to conventional Newton-based or general iterative solvers, which require explicit modeling of continuous intensity-dependent crosstalk and may incur substantial computational cost or convergence uncertainty, the proposed method employs a fixed set of precomputed matrices combined with an efficient positional indexing of numerous crosstalk states obtained via multi-level image segmentation in the correction process. In a 2K × 2K four-channel fluorescence system, a tri-point implementation reduced processing time to 1.8 seconds, compared with 374 seconds for a two-iteration Newton solver. The method achieved correction improvement factors of 1.42-fold relative to a general iterative solver and 1.91-fold relative to a Newton solver. These results demonstrate that the multi-point framework provides both substantial computational acceleration and improved correction accuracy for quantitative fluorescence imaging.
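The positional-indexing idea (select one of the N^M precomputed matrices from each channel's segmented intensity level, then unmix) can be sketched per pixel as follows; the thresholds and matrices here are hypothetical toy values, not the calibrated system's:

```python
import numpy as np

def correct_pixel(obs, matrices, thresholds):
    """Index the precomputed correction matrix by each channel's intensity
    level (multi-level segmentation via searchsorted), then unmix the
    observed M-channel vector with a single matrix-vector product."""
    levels = tuple(int(i) for i in np.searchsorted(thresholds, obs))
    return matrices[levels] @ obs
```

Because the lookup replaces any iterative solve, per-pixel cost is one index computation plus one M×M product, which is the source of the speedup reported above.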
Longitudinal MRI Analysis Platform for Monitoring Disease Progression in Multiple Sclerosis
ABSTRACT. Longitudinal magnetic resonance imaging (MRI) is essential for monitoring the progression of Multiple Sclerosis (MS). This work presents a lightweight, end-to-end integrated software platform for longitudinal brain MRI analysis and progression assessment in MS. The proposed platform integrates visualization, annotation management, preprocessing (Original (O), Histogram Normalized (HN), Gaussian Smoothed (GS), and Gaussian smoothing following histogram normalization (GSHN)), rigid registration, and disease evolution quantification within a unified and persistent workflow. Annotation propagation across different time points (T1-T4) of the disease progression is supported through rigid transformation reuse. This enables spatially aligned initialization of segmented lesions and structured comparison of stable, projected, and newly appearing MS lesions. Experimental evaluation on five longitudinal clinical cases, at T1-T4, demonstrated consistent improvement in intra-subject alignment across all preprocessing variants using correlation-based rigid registration, with median MSE reductions of up to 0.57 and the correlation coefficient, ρ, increasing up to 0.93. The proposed system enables reproducible longitudinal studies, lessens the burden of manual (M) delineations and annotations, and lays the groundwork for future research into lesion evolution, the characterization of pre-lesional tissue, and automated lesion segmentation.
Process-Aware Conformal Prediction for Ambulance Response Time Estimation
ABSTRACT. Accurate prediction of ambulance response times is critical for emergency medical services (EMS) dispatch. Existing Machine Learning (ML) models mostly depend on the quality of the training data provided, and often fail to communicate the complex uncertainty inherent in EMS data. This paper presents a process-aware conformal prediction framework that provides distribution-free prediction intervals with finite-sample coverage guarantees for EMS response times. The response process is decomposed into four operational phases: call processing, dispatch, crew mobilization, and travel. Conformal Quantile Regression (CQR) is applied independently to each stage to compose intervals that maintain the coverage guarantee based on the provided data. Four conformal variants are evaluated on a synthetic dataset of 75,000 incidents calibrated and adjusted to reflect the Cyprus operational model. Findings suggest that CQR produces the tightest intervals of the four methods and would be the preferred conformal variant for real-time dispatch applications when the objective is to minimize interval width while maintaining coverage. Per-stage decomposition produces wider intervals but provides diagnostic information useful for quality assurance and process improvement.
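The CQR step can be sketched as follows: conformity scores on a calibration split widen the test-time quantile band just enough to restore the (1 − α) marginal coverage guarantee. This is a generic split-conformal sketch, not the paper's exact per-stage pipeline:

```python
import numpy as np

def cqr_intervals(lo_cal, hi_cal, y_cal, lo_test, hi_test, alpha=0.1):
    """Conformalized quantile regression: compute conformity scores on the
    calibration set, take their finite-sample-corrected (1 - alpha) quantile,
    and expand the test-time quantile band by that amount."""
    scores = np.maximum(lo_cal - y_cal, y_cal - hi_cal)
    n = len(y_cal)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, level, method="higher")
    return lo_test - q, hi_test + q
```

When the calibration targets already fall inside the predicted band, the adjustment is zero and the quantile band is returned unchanged; undercoverage on calibration data yields a positive widening.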
ClinCode Copilot: An Interactive Clinical Coding Assistant with Dual Interpretability
ABSTRACT. The assignment of International Classification of Diseases (ICD) codes to clinical encounters is a labor-intensive process that is both error-prone and costly. While automated coding models have made substantial progress on benchmark datasets, they typically operate as batch-oriented black boxes that produce ranked code lists without supporting evidence. Clinical coders, however, require interactive tools that explain predictions in terms they can verify against the source document.
We present ClinCode Copilot, a web-based clinical coding assistant that integrates a hybrid machine learning model with an interactive workspace designed for real-time ICD code review. The system combines an end-to-end label attention classifier with chunk-level K-nearest neighbor retrieval to provide dual interpretability: per-code attention highlighting identifies which sections of a discharge summary support each prediction, while similar patient retrieval surfaces training cases that received the same code. Built with a FastAPI backend serving a Bio_ClinicalBERT encoder and FAISS index, and a Next.js frontend with resizable multi-panel layout, ClinCode Copilot enables clinical coders to review predictions, inspect evidence, and verify codes within a single interface. The underlying model achieves Micro-F1 of 0.482 on 1275 ICD-9 codes from MIMIC-III. The tool is open-source, and the code is publicly available at https://github.com/ieeta-mith/clincode-copilot.
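The chunk-level retrieval half of the dual interpretability can be pictured with a brute-force inner-product search; a FAISS IndexFlatIP computes the same scores at scale, and the embeddings and codes below are toy placeholders rather than system outputs:

```python
import numpy as np

def knn_chunks(query, chunk_embs, chunk_codes, k=3):
    """Retrieve the k most similar training chunks by inner product (the
    scoring a FAISS IndexFlatIP performs) and return their ICD codes as
    supporting evidence for a prediction."""
    sims = chunk_embs @ query
    top = np.argsort(-sims)[:k]
    return [chunk_codes[i] for i in top], sims[top]
```

In the deployed system the returned neighbors are surfaced in the workspace as "similar patients" alongside the attention highlighting, giving the coder two independent lines of evidence.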
ChemoTwin: A Digital Twin Framework for Real-Time Chemotherapy Toxicity Prediction
ABSTRACT. Chemotherapy is essential in cancer treatment but often causes serious toxicities such as cardiotoxicity and neutropenia. In current practice, toxicity monitoring relies mainly on population guidelines and periodic visits, so adverse events are often detected late. ChemoTwin is a clinical decision support prototype that uses a digital twin approach to help clinicians monitor chemotherapy toxicity. The system combines continuous monitoring of patient data, toxicity risk prediction, and access to clinical guidelines so that recommendations stay evidence-based. The prototype brings together more than 15 clinical parameters and estimates risks for key adverse events. Preliminary evaluation shows promising prediction performance and positive feedback from oncology practitioners, supporting the feasibility of this approach for future clinical validation.
A telemonitoring framework for arteriovenous graft and vital signs
ABSTRACT. Chronic kidney disease requires long-term clinical management, and many patients undergo hemodialysis several times per week.
In some cases, particularly among older patients, an arteriovenous graft is implanted to enable vascular access.
Although effective, graft implantation carries risks, including infections, occlusions, and graft failure, which can be life-threatening.
TeleGraft is a European project that aims to develop an intelligent arteriovenous graft for hemodialysis patients and to enable regular monitoring of blood flow anomalies and infections, through embedded sensors and Raman spectroscopy.
This paper presents a telemonitoring system designed to monitor biosignals acquired from arteriovenous grafts and to detect potential complications.
The system is also designed to be modular and adaptable, enabling the monitoring of other types of biosignals and supporting patients with distinct medical conditions.
As a result, we developed a web application that allows healthcare professionals to monitor the status of the graft and patient data, as well as a mobile application that supports the acquisition and submission of various health biosignals to the telemonitoring platform.
Your Health in Your Hand: Designing a Minimum Viable Product for Citizen Empowerment
ABSTRACT. The digital transformation of healthcare in Europe is currently undergoing a paradigm shift, moving from provider-centric systems toward a patient-oriented ecosystem mandated by the European Health Data Space regulation. While the digitalization of hospital infrastructures has been largely successful, citizens often remain passive subjects with limited control over their clinical data. This paper presents the design and implementation of a National Mobile Health Application developed as a Minimum Viable Product to bridge the gap between legislative requirements and practical citizen empowerment. The proposed solution integrates an Electronic Health Record viewer, secure medical sharing via the Verifiable Health Links protocol, and teleconsultation services within a unified, cross-platform interface. A key contribution of this work is the integration of the European Digital Identity Wallet for high-assurance authentication and the OpenNCP framework to facilitate cross-border semantic interoperability through the MyHealth@EU infrastructure. By utilizing open standards such as HL7 FHIR and CDA, the MVP demonstrates a scalable framework that enables citizens to exercise their rights to data access, portability, and insertion. Our results highlight how the synergy of these components fosters health literacy and therapeutic equity, providing a technical blueprint for National Health Authorities to comply with the EHDS and ensure that clinical data remains actionable across borders.
Cognitive Function as a Predictor of Fall Risk in Older Adults Participating in Cognitive and Physical Training Programs
ABSTRACT. Falls are a leading cause of injury and loss of independence among older adults. While physical factors such as balance and muscle strength are known contributors to fall risk, increasing evidence suggests that cognitive function, particularly executive function, attention, and processing speed, also plays a significant role. Understanding the relationship between cognitive performance and fall risk may help improve fall prevention strategies. This study aimed to examine whether cognitive function predicts fall risk in older adults participating in a combined cognitive and physical training program. A sample of community-dwelling older adults aged 60 years and above participated in a structured cognitive and physical training program titled LLM Care. Cognitive function was assessed using standardized neuropsychological tests evaluating global cognition, executive function, and attention. Fall risk was measured using established functional mobility and balance assessments. Participants completed a multi-week intervention consisting of cognitive training exercises and physical activities focusing on balance, strength, and coordination. Statistical analyses were conducted to examine associations between cognitive performance and fall risk outcomes.
Improving Antimicrobial Peptide Classification via k-Means and q-value Guided Feature Grouping and Ranking
ABSTRACT. Antimicrobial peptides (AMPs) have emerged as strong candidates to replace traditional antibiotics in response to the escalating problem of antimicrobial resistance. With the rapid progress of computational biology and the structural complexity of AMPs, reliable machine learning–based classification approaches have become increasingly essential. In this work, we introduce a novel statistical framework designed for systematic feature grouping, scoring, and selection to improve AMP prediction performance. The proposed approach applies k-means clustering to partition peptide features into coherent groups and prioritizes them using q-values obtained from z-score–based statistical analysis. The framework was validated on two large-scale AMP benchmark datasets comprising 3,556 and 12,022 peptide sequences. Feature subsets derived from the ranked groups were used to train seven machine learning classifiers along with their ensemble variants. The results indicate that the proposed method surpasses existing grouping-based strategies on Dataset 1, achieving an accuracy of 0.9238 ± 0.0092, and delivers competitive performance on Dataset 2 with an accuracy of 0.8667 ± 0.0083.
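The q-value ranking can be sketched with a Benjamini-Hochberg conversion of z-score-derived p-values, one common way to obtain q-values; the paper's exact procedure may differ:

```python
import numpy as np

def bh_qvalues(pvals):
    """Benjamini-Hochberg q-values: a monotone step-up adjustment of sorted
    p-values, usable to score and rank clustered feature groups."""
    p = np.asarray(pvals, dtype=float)
    n = len(p)
    order = np.argsort(p)
    ranked = p[order] * n / (np.arange(n) + 1)
    q_sorted = np.minimum(np.minimum.accumulate(ranked[::-1])[::-1], 1.0)
    out = np.empty(n)
    out[order] = q_sorted
    return out
```

Groups whose aggregated q-values are smallest would be prioritized when assembling the ranked feature subsets fed to the classifiers.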
Network-based Characterization of Synergistic Drug Combinations in Alzheimer’s Disease
ABSTRACT. Alzheimer’s disease is a multifactorial neurodegenerative disorder for which single-target pharmacological strategies have shown limited clinical success. The complexity of its underlying molecular mechanisms suggests that effective treatment may require the simultaneous modulation of multiple pathological pathways. In this context, drug combination therapies represent a promising alternative, although the experimental exploration of all possible combinations is unfeasible due to the vast combinatorial space. Network medicine provides a framework to address this challenge by modeling diseases and drugs within the human protein–protein interaction network and quantifying their topological relationships. In this study, we present a network-based pipeline to systematically evaluate potential drug combinations for Alzheimer’s disease by integrating drug–disease proximity and drug–drug separation metrics. We apply this framework to a large set of approved and investigational drugs in order to identify combinations that are both proximal to the disease module and topologically complementary. We specifically prioritize the combination of memantine and glimepiride, which exhibits an ideal complementary exposure pattern (sAB = 1.1). Our results reveal that while direct interaction with the disease module is limited (1.4%), this pair influences 356 unique proteins (13.2% of the total Alzheimer’s disease module) through 2-hop propagation paths. Overall, this framework identifies drug pairs that target complementary disease pathways, providing a clear biological rationale for developing new combination therapies for Alzheimer’s disease.
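The separation score s_AB follows the standard network-medicine form s_AB = ⟨d_AB⟩ − (⟨d_AA⟩ + ⟨d_BB⟩)/2. A minimal BFS-based sketch on an unweighted interactome is shown below with a toy adjacency list, not the actual PPI network:

```python
from collections import deque

def bfs_dist(adj, src):
    """Hop distances from src in an unweighted graph given as {node: [nbrs]}."""
    dist, q = {src: 0}, deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def avg_closest(adj, X, Y, exclude_self=False):
    """Mean over x in X of the distance to the closest y in Y (for within-set
    averages pass exclude_self=True; the set must then contain > 1 node)."""
    total = 0
    for x in X:
        d = bfs_dist(adj, x)
        total += min(d[y] for y in Y if y in d and not (exclude_self and y == x))
    return total / len(X)

def separation(adj, A, B):
    """s_AB = <d_AB> - (<d_AA> + <d_BB>) / 2: positive values indicate
    topologically separated target modules (complementary exposure),
    negative values indicate overlapping modules."""
    d_ab = (avg_closest(adj, A, B) * len(A)
            + avg_closest(adj, B, A) * len(B)) / (len(A) + len(B))
    return d_ab - (avg_closest(adj, A, A, True) + avg_closest(adj, B, B, True)) / 2
```

On a four-node path graph split into two adjacent modules, the score is a small positive value, the qualitative regime in which the prioritized pair's sAB = 1.1 also falls.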
Out-of-Distribution Detection in Drug-Drug Interactions via Supervised Contrastive Learning
ABSTRACT. Much progress has been noted in the prediction of drug-drug interactions with machine learning. Usually, this is performed under a closed-world assumption, whereby all training and test data stem from the same distribution. In reality, the test data can be a mixture of in-distribution and out-of-distribution (OoD) instances. However, the predictor would misclassify the OoD instances, leading to erroneous predictions and thus misleading clinicians.
In the current work we apply supervised contrastive learning (SCL) to detect OoD instances and distinguish them from in-distribution instances. In particular, the OoD instances are unknown interactions, i.e. interactions that have not been observed in the training set. The method applies SCL on embeddings of drug pairs, but also on the penultimate layer and on the output of a neural network that was trained to predict in-distribution instances. We compare the three aforementioned approaches to a baseline k-NN-based method that does not use SCL. The role of SCL is to enforce a strong separation of the in-distribution and OoD instances. We also study the role of negative samples representing the lack of any interaction among drug pairs. Moreover, there are experiments with different mixtures of in-distribution (observed) and OoD (unobserved) interactions. Results are presented on public data extracted from the Twosides dataset, and by and large SCL performs better than the baseline method.
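The baseline k-NN detector (the non-SCL comparison) can be sketched as a k-th-nearest-neighbour distance score in embedding space; this is a hypothetical minimal version, not the paper's exact baseline:

```python
import numpy as np

def knn_ood_score(x, train_embs, k=5):
    """Distance from x to its k-th nearest in-distribution training embedding;
    larger scores suggest an out-of-distribution (unobserved) drug pair."""
    d = np.linalg.norm(train_embs - x, axis=1)
    return float(np.sort(d)[min(k, len(d)) - 1])
```

Thresholding this score yields an OoD flag; the role of SCL in the paper is precisely to reshape the embedding space so that such distance-based separation becomes sharper.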
Overcoming Small-Degree Cold-Start in Heterogeneous Biological Networks through Latent Bridges
ABSTRACT. Predicting drug-disease interactions is essential for pharmacological discovery, yet most computational models underperform in cold-start scenarios where entities have limited interaction history. Standard Graph Neural Networks (GNNs) typically require dense connectivity, leaving sparse nodes topologically and informationally isolated within biological networks. This study addresses such sparsity by constructing a unified heterogeneous knowledge graph using drug-gene (ChG-Miner) and drug-disease (DCh-Miner) associations from the BioSNAP database. We demonstrate that gene-target interactions serve as a latent bridge, allowing the model to infer context for drugs with minimal disease history through their genomic profiles. We evaluate two GNN architectures — GraphSAGE and Graph Attention Networks (GAT) — under severe sparsity, where over 50% of nodes exhibit a degree d ≤ 1. Results reveal a significant architectural trade-off: while GAT is effective in data-rich regions, its performance drops to a ROC-AUC of 0.79 for degree-1 nodes. In contrast, GraphSAGE maintains a ROC-AUC of 0.97 in the same sparse regions. This suggests that attention mechanisms become a liability when local neighborhoods are insufficient for statistical weighting. Our findings indicate that while attention-based models excel in well-annotated contexts, mean-pooling aggregators like GraphSAGE provide superior robustness for rare diseases and novel compound association prediction.
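The latent-bridge idea can be sketched with a GraphSAGE-style mean aggregation on a toy heterogeneous graph (node names and one-hot features are hypothetical; learned weights and nonlinearities are omitted for clarity):

```python
import numpy as np

# Tiny heterogeneous graph (hypothetical): drug D has only one edge,
# to gene G; G also connects to a well-annotated drug D2 and disease X.
nodes = ["D", "G", "D2", "X"]
adj = {"D": ["G"], "G": ["D", "D2", "X"], "D2": ["G"], "X": ["G"]}
feat = {n: v for n, v in zip(nodes, np.eye(4))}

def sage_layer(feat):
    """One GraphSAGE-style step: concat(self, mean of neighbor features)."""
    out = {}
    for n in feat:
        nbr = np.mean([feat[m] for m in adj[n]], axis=0)
        out[n] = np.concatenate([feat[n], nbr])
    return out

h = sage_layer(sage_layer(feat))
# After two hops, the degree-1 drug D's embedding already carries
# information about disease X, reached through the gene bridge G.
print(h["D"].shape)
```

Even with a single edge, D inherits a stable mean of its neighborhood; an attention layer over the same one-element neighborhood has nothing to weight, which is one reading of the trade-off reported above.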
ABSTRACT. Accurate and reproducible identification of sentences describing adverse drug events (ADEs) is essential for pharmacovigilance in regulated environments. Although frontier large language models (LLMs) demonstrate strong general-purpose capabilities, their deployment in medical settings requires predictable, auditable, and computationally feasible operation. Proprietary LLM services are frequently updated and non-transparent, complicating validation in regulated workflows. Recent studies report moderate ADE sentence classification performance (F1 ≈ 0.84–0.85) using zero-shot and few-shot prompting with GPT-4–class models.
We present a validation-aware, size-stratified evaluation framework for sentence-level ADE classification on the deduplicated ADE-Corpus-V2 dataset, where ADE-positive sentences constitute approximately 20% of the dataset. Models ranging from 110 million to 7 billion parameters were trained and evaluated under a fixed single-GPU configuration (NVIDIA RTX 4000 Ada, 20 GB). To reflect operational performance under class imbalance, we report weighted F1 together with pharmacovigilance-oriented ADE-positive precision, recall, and F1 for all evaluated models. Biomedical domain-specific pretraining combined with supervised adaptation demonstrated greater practical impact than increasing model scale alone. Mid-sized biomedical PLMs (approximately 300–400M parameters) achieved strong and comparable ADE-positive performance with balanced precision and recall across repeated runs. For example, GatorTron-base (345M) achieved ADE-positive precision 0.875 ± 0.011, recall 0.892 ± 0.021, and F1 0.883 ± 0.006, while maintaining high weighted F1 (0.952 ± 0.002).
These findings indicate that reliable ADE sentence classification can be achieved without reliance on extremely large proprietary LLM services. Mid-sized biomedical PLMs provide a favorable balance between safety-oriented sensitivity and practical deployment considerations, offering actionable guidance for validation-oriented model selection in regulated pharmacovigilance workflows.
Developing and Evaluating Candidate Surrogate Outcomes for Alzheimer’s Disease Clinical Trials: A Simulation Study
ABSTRACT. The high failure rate of Alzheimer's disease trials is partly attributable to the lack of valid surrogate outcomes that reliably predict patient-relevant benefit. This study introduces multibiomarker putative surrogate outcomes, integrating amyloid, tau and neurodegeneration features to address this gap. Using a comprehensive simulation framework, we evaluated unweighted and variance-weighted multibiomarker models against the standard reduction in amyloid load benchmark under realistic trial conditions, including informative censoring and heterogeneous progression rates. Results demonstrate that while the single-marker benchmark failed to predict cognitive decline, the variance-weighted multibiomarker outcome significantly optimised the signal-to-noise ratio, increasing statistical power from 44 percent to 77 percent in small-sample trials. Furthermore, the model established a valid surrogate threshold effect for the Alzheimer's Disease Assessment Scale–Cognitive Subscale 13 but no such validation was established for the Mini-Mental State Examination. These findings confirm that variance-based feature engineering is essential for constructing robust multivariable surrogate endpoints, ultimately enabling faster and more reliable evaluation of new Alzheimer’s disease treatments.
ABSTRACT. Acute Kidney Injury (AKI) progression from KDIGO Stage 1 to Stage ≥2 in the ICU is associated with increased mortality, need for renal replacement therapy, and prolonged stay. Most existing predictive models are limited in handling heterogeneous multivariate temporal data and in real-time interpretability, hindering real-time clinical decision-making. We present a framework for continuous prediction of AKI progression that employs temporal abstraction, Time-Interval-Related Pattern (TIRP) mining, and the Fully Continuous Prediction Model (FCPM) to deliver continuously interpretable risk estimates. Using state temporal abstraction, raw multivariate clinical temporal data are transformed into Symbolic Time Interval (STI) series from which frequent TIRPs ending with the AKI progression event are mined. FCPM learns a model that consists of the mined patterns and then estimates their completion probability in real time as new clinical data arrive, leveraging the time-duration distributions while a pattern is unfolding. We further introduce Prolonged Temporal Discretization (PTD), a supervised temporal state abstraction method that chooses cutoffs whose STI time durations are longer in the pre-event data than in admissions without the event. A rigorous evaluation on a cohort of 1,343 ICU patients shows that FCPM combined with PTD achieves an AUC-ROC of 0.838 and a mean lead time of 5.8 hours, significantly outperforming LSTM, ResNet, TFT, and XGBoost baselines in both discrimination and earliness.
Explainable Machine Learning for Fall Risk and Post-Fall Mortality Prediction in Nursing Home Residents Using Autoencoder-Based Synthetic Data Augmentation
ABSTRACT. Falls in institutionalized populations represent a major clinical and operational challenge, motivating decision-support tools for risk stratification and outcome prediction. This study presents a machine learning pipeline to predict post-fall mortality and related fall outcomes in nursing home residents using routinely collected structured variables, including demographic, mobility, and functional assessment data. Data from 2009 to 2022 were organized into task-specific datasets comprising a falls-only cohort of 1,368 records and a mixed cohort of 1,445 cases. Preprocessing involved one-hot encoding, MinMax scaling, and stratified 80/20 splits. To address severe class imbalance in mortality prediction, autoencoder-based data augmentation was applied, followed by grid-search model selection and SHAP-based interpretability. Fall occurrence prediction achieved the highest performance, with Gradient Boosting reaching a Macro-F1 of 0.89. Post-fall mortality prediction remained challenging due to extreme imbalance, yielding a Macro-F1 of 0.48, while the multiclass fall-count severity task achieved a Macro-F1 of 0.46. These findings demonstrate the potential of explainable machine learning for fall-related risk stratification in nursing homes while emphasizing the need for larger datasets, richer clinical variables, and external validation to improve rare-event prediction and support real-world implementation in long-term care settings.
Continuous Falls Prediction Among Care Home Residents
ABSTRACT. Proactive fall prevention in care homes is a major healthcare challenge, currently limited by the difficulty of modeling complex real-world data. This study leverages a unique, large-scale database collected via a mobile care monitoring application, capturing the daily living activities of over 140,000 residents in multiple UK care homes. To utilize these irregular and heterogeneous event streams effectively, we propose a continuous prediction framework that transforms raw records into symbolic time intervals using temporal abstraction and mines frequent Time Interval Related Patterns (TIRPs). The Fully Continuous Prediction Model (FCPM) models TIRPs that end with a fall and, in real time, estimates the completion probability of the unfolding patterns, i.e., the probability that a fall will occur. In this study we propose two enhancements to the FCPM: the use of three general temporal relations, and a supervised state abstraction method. A rigorous evaluation on a large real-life database shows that the three general temporal relations perform significantly better than Allen's seven relations, and that the TD4C abstraction performs better than EWD, EFD, and SAX. Finally, the FCPM outperforms the baseline models, including deep learning sequence-based models (LSTM-FCN, ResNet, TFT) and feature-based classifiers (XGBoost, ROCKET).
Q2VA: A Bayesian Adaptive Method for Visual Acuity Assessment with Tumbling E Stimuli
ABSTRACT. Visual Acuity (VA) serves as the cornerstone of functional visual assessment and is critical for the diagnosis and monitoring of ophthalmic pathologies. While traditional tools like the Snellen and ETDRS charts are widely used, they face inherent limitations in balancing testing efficiency with measurement precision. Furthermore, current adaptive digital systems predominantly utilize Latin letter optotypes, creating significant cognitive barriers for specific Chinese demographics, particularly pediatric populations and illiterate adults. To bridge this gap, this study introduces Q2VA, a novel digital visual acuity testing module tailored specifically for the Chinese population, employing the “Tumbling E” optotype to eliminate linguistic bias and integrating a Bayesian adaptive algorithm with an active learning strategy to optimize assessment. The system dynamically estimates both the visual acuity threshold and the selection of stimuli. Validation through Monte Carlo simulations and psychophysical experimental results demonstrates that Q2VA can achieve high accuracy and precision of measurement within just 15 trials. The system maintains a low standard deviation (0.02 LogMAR) and high sensitivity to 0.03 LogMAR changes. Q2VA has the potential to provide a robust, efficient, and population-inclusive solution for high-precision visual acuity monitoring in eye clinics.
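A Bayesian adaptive threshold procedure of the kind described can be sketched as follows (the threshold grid, psychometric parameters, and stimulus-placement rule are illustrative assumptions, not Q2VA's actual algorithm):

```python
import numpy as np

rng = np.random.default_rng(1)

# Grid of candidate VA thresholds (LogMAR) and a flat prior over them.
grid = np.linspace(-0.3, 1.0, 131)
posterior = np.ones_like(grid) / grid.size

def p_correct(stim, thresh, slope=10.0, guess=0.25, lapse=0.02):
    """Psychometric function for a 4-alternative tumbling-E task:
    probability of a correct response at stimulus size `stim` given
    a true threshold `thresh` (all parameter values are illustrative)."""
    p = 1.0 / (1.0 + np.exp(-slope * (stim - thresh)))
    return guess + (1 - guess - lapse) * p

true_threshold = 0.30   # simulated observer
for _ in range(15):     # 15 trials, as in the abstract
    stim = float(np.sum(grid * posterior))        # test at posterior mean
    correct = rng.random() < p_correct(stim, true_threshold)
    like = p_correct(stim, grid)                  # likelihood over the grid
    posterior *= like if correct else (1 - like)  # Bayes update
    posterior /= posterior.sum()

estimate = float(np.sum(grid * posterior))
print(round(estimate, 2))
```

An active-learning variant would instead choose the stimulus that maximizes expected information gain over the posterior rather than testing at its mean.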
A Case-Based and Clustering Framework for Diverse Counterfactual Explanations
ABSTRACT. Counterfactual explanations are a promising approach for interpreting predictive models in healthcare. However, most existing counterfactual methods produce explanations with limited diversity and weak population-level coherence. This paper addresses this limitation by investigating whether organizing past clinical cases into clinically coherent clusters can improve the diversity of counterfactual explanations. We propose a clustering-aware, case-based framework that integrates contrastive retrieval through Nearest Unlike Neighbors with cluster-level organization and representative case selection to guide counterfactual generation. Feature modifications are driven by SHAP-based attributions and constrained to preserve clinical plausibility and validity. Experimental results on a Chronic Kidney Disease dataset show that clustering-driven case selection increases counterfactual diversity compared to other methods. These findings suggest that population-aware, case and cluster-based organization constitutes a relevant mechanism for enhancing the expressiveness of counterfactual explanations in healthcare predictive models.
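The Nearest Unlike Neighbor retrieval step can be sketched in a few lines (toy two-feature cases and plain Euclidean distance; the study's actual feature space, clustering, and plausibility constraints are richer):

```python
import numpy as np

def nearest_unlike_neighbor(x, X, y, label):
    """Return the closest past case whose outcome differs from `label`
    (the contrastive retrieval step that seeds a counterfactual)."""
    mask = y != label
    d = np.linalg.norm(X[mask] - x, axis=1)
    return X[mask][np.argmin(d)]

# Hypothetical 2-feature CKD cases: label 1 = disease, 0 = healthy.
X = np.array([[1.0, 1.0], [1.2, 0.9], [4.0, 4.2], [3.8, 4.0]])
y = np.array([1, 1, 0, 0])
query = np.array([1.1, 1.0])            # case predicted as diseased
print(nearest_unlike_neighbor(query, X, y, 1))
```

In the proposed framework, such retrieval is performed per cluster so that counterfactuals drawn from different clinically coherent groups diversify the explanation set.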
Machine learning to analyze the factors catalyzing coercive practices in Psychiatry
ABSTRACT. This study investigates the factors influencing the application and intensity of coercive practices in psychiatric inpatient care, focusing on two French hospitals: CHU de Saint-Etienne and CH Drome Vivarais. Moving beyond the traditional binary prediction of “coercion vs. no coercion,” we introduce a novel coercion intensity score based on the frequency and type of restrictive measures during hospitalization. Twenty classification and several regression algorithms were evaluated, with XGBoost achieving the best classification performance and excelling in intensity prediction. The models, while showing moderate performance comparable to the literature, identified key predictors including legal care modality, number of diagnoses, referral source, and patient age. Younger age, aggression, specific psychiatric diagnoses, and suicidal ideation were associated with higher coercion intensity.
ASPEN: Spectral-Temporal Fusion for Cross-Subject Brain Decoding
ABSTRACT. Cross-subject generalization in EEG-based brain-computer interfaces (BCIs) remains challenging due to individual variability in neural signals. We investigate whether spectral representations offer more stable features for cross-subject transfer than temporal waveforms. Through correlation analyses across three EEG paradigms (SSVEP, P300, and Motor Imagery), we find that spectral features exhibit consistently higher cross-subject similarity than temporal signals. Motivated by this observation, we introduce ASPEN, a hybrid architecture that combines spectral and temporal feature streams via multiplicative fusion, requiring cross-modal agreement for features to propagate. Experiments across six benchmark datasets reveal that ASPEN is able to dynamically achieve the optimal spectral-temporal balance depending on the paradigm. ASPEN achieves the best unseen-subject accuracy on three of six datasets and competitive performance on others, demonstrating that multiplicative multimodal fusion enables effective cross-subject generalization.
Characterization of Hippocampal Local Field Potentials Using Lyapunov Exponent Analysis and Machine Learning
ABSTRACT. This study examines the application of Lyapunov exponent analysis (LE) to characterize local field potentials (LFPs) from hippocampal brain slices in animal models, with a focus on distinguishing between basal and active states of hippocampal activity. We used LFP recordings obtained from hippocampal slices treated with kainic acid to induce active states, capturing transitions and sustained periods of activity. The signals were pre-processed to standardize their length and filtered using a 4th-order Butterworth bandpass filter to isolate gamma oscillations. LE analysis was used to assess the dynamical behavior of these signals, revealing that positive LE values indicate chaotic dynamics, which were prevalent in the active state recordings. Further analysis using time-series clustering distinguished patterns in progression from the basal to active states, suggesting that LE could serve as a biomarker for neurophysiological and pathological conditions, including Alzheimer's disease. Our findings suggest that LE analysis provides a novel approach to understanding the complex dynamics of the hippocampus, potentially contributing to the early diagnosis of neurodegenerative diseases.
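The preprocessing step described above, a 4th-order Butterworth bandpass isolating gamma oscillations, can be sketched on a synthetic signal (the sampling rate, band edges, and signal composition here are illustrative, not the study's values):

```python
import numpy as np
from scipy.signal import butter, filtfilt, welch

fs = 1000.0                      # sampling rate in Hz (illustrative)
t = np.arange(0, 5, 1 / fs)

# Synthetic LFP: theta (6 Hz) + gamma (40 Hz) components plus noise.
rng = np.random.default_rng(2)
lfp = (np.sin(2 * np.pi * 6 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)
       + 0.2 * rng.standard_normal(t.size))

# 4th-order Butterworth bandpass over an assumed 30-80 Hz gamma band,
# applied forward-backward (filtfilt) to avoid phase distortion.
b, a = butter(4, [30, 80], btype="bandpass", fs=fs)
gamma = filtfilt(b, a, lfp)

# Sanity check: power should now be concentrated near 40 Hz, not 6 Hz.
f, pxx = welch(gamma, fs=fs, nperseg=2048)
print(f[np.argmax(pxx)])
```

Lyapunov exponent estimation would then be run on `gamma`; the filtering matters because broadband noise can masquerade as chaotic divergence.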
Evaluation of fMRIPrep for Preprocessing 7 Tesla Functional MRI Data
ABSTRACT. Functional magnetic resonance imaging (fMRI) is a key tool for investigating spontaneous brain activity and functional brain organization. However, the reliability of downstream analyses critically depends on preprocessing quality, particularly in high-field acquisitions where susceptibility distortions, motion artifacts, and spatial misalignment may affect data interpretation.
This study evaluates the robustness of the fMRIPrep pipeline for preprocessing 7 Tesla fMRI data, with a particular focus on the reliability of its outputs. We propose a systematic validation framework specifically designed for ultra-high-field imaging, enabling a comprehensive and reproducible assessment of preprocessing performance. The framework is based on a stepwise analysis of pipeline outputs, including final derivatives, intermediate data, and quantitative metrics. It provides a structured evaluation of key preprocessing stages—motion correction, susceptibility distortion correction, anatomical–functional alignment, and spatial normalization to the MNI152 standard space—combining visual inspection, quantitative criteria, and expert assessment. This approach ensures a detailed validation of anatomical integrity, spatial consistency, and BOLD signal quality.
Results demonstrate that fMRIPrep effectively processes 7 Tesla fMRI data, producing reliable outputs with accurate functional–anatomical alignment and robust spatial normalization. The proposed framework highlights the consistency and quality of preprocessing across datasets. This work provides a transparent and reproducible evaluation strategy and establishes a methodological foundation for ultra-high-field fMRI analyses.
Chirplet Analysis of Brain State During Snow and Cold Exposure
ABSTRACT. This paper presents an exploratory chirplet-based signal-processing study of wearable EEG recorded during outdoor cold-plunges in snow. Recordings were made with Muse EEG headbands, using inertial measurements from the IMU (Inertial Measurement Unit) built into the device. Because mobile EEG is strongly affected by motion artifacts, tri-axial gyroscope magnitude was used to reject high-motion intervals before chirplet-based time–frequency analysis. We analyze device-derived beta-band EEG estimates using Gaussian chirplets and compare the resulting structure with conventional stationary Fourier-style representations. The aim is not to infer cognitive state or physiological cold response, but to evaluate whether the chirplet transform provides a compact representation of non-stationary wearable EEG segments recorded in a challenging outdoor environment. Results suggest that chirplet-based analysis reveals localized time–frequency structure.
State-of-Study in Hyperbaric Oxygen Therapy: Wearable EEG and fNIRS During a Single HBOT Session
ABSTRACT. Hyperbaric oxygen therapy (HBOT) exposes the body to elevated partial pressures of oxygen inside a pressurized chamber, a setting that may influence cerebral hemodynamics and neural oscillatory activity. We report a single-participant feasibility study using two Muse-class wearable headbands (Muse 2 for pre-session baseline; Muse S Athena for in-chamber recording) to simultaneously capture 4-channel EEG (256 Hz) and, on the Athena device, 8-channel near-infrared spectroscopy (fNIRS) at 730 nm and 850 nm. Using a reproducible pipeline (0.5–30 Hz bandpass, 5 s epoching, amplitude-based artifact rejection, Welch PSD), we compare spectral features before and during HBOT. Pooled theta/beta ratio (TBR) increased from 2.39 ± 2.23 (316 epochs, pre-session) to 3.56±3.64 (383 epochs, in-chamber), a relative increase of 49%. Frontal alpha asymmetry (FAA) shifted from +1.22 to near zero. During the HBOT session, fNIRS-derived hemodynamic estimates at inner optode positions showed positive ∆HbO and negative ∆HbR, consistent with increased cortical oxygenation. Epoch-level correlation analysis revealed a significant positive association between TBR and ∆HbO at both left-inner (r = 0.16, p = 0.002) and right-inner (r = 0.14, p = 0.008) positions. These results are preliminary and demonstrate feasibility of concurrent wearable EEG + fNIRS monitoring during HBOT, along with a reusable analysis scaffold for future controlled studies.
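The spectral pipeline (Welch PSD, theta/beta ratio per epoch) can be sketched as follows; the band edges and synthetic epoch are conventional illustrative choices, not necessarily the study's exact settings:

```python
import numpy as np
from scipy.signal import welch

fs = 256.0  # EEG sampling rate, as reported in the abstract

def theta_beta_ratio(epoch, fs=fs):
    """Theta/beta ratio from a Welch PSD: band power in 4-8 Hz
    divided by band power in 13-30 Hz (usual band conventions)."""
    f, pxx = welch(epoch, fs=fs, nperseg=min(len(epoch), 512))
    theta = pxx[(f >= 4) & (f < 8)].sum()
    beta = pxx[(f >= 13) & (f < 30)].sum()
    return theta / beta

# Synthetic 5 s epoch with a dominant 6 Hz (theta) component.
t = np.arange(0, 5, 1 / fs)
rng = np.random.default_rng(3)
epoch = (2.0 * np.sin(2 * np.pi * 6 * t) + 0.5 * np.sin(2 * np.pi * 20 * t)
         + 0.1 * rng.standard_normal(t.size))
print(theta_beta_ratio(epoch) > 1.0)  # theta-dominant epoch gives TBR > 1
```

Pooling such per-epoch ratios across the pre-session and in-chamber recordings yields the TBR distributions compared in the abstract.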
Speech Biomarkers for mTBI Screening: Effects of Sports-Field Noise and Speech Enhancement
ABSTRACT. Speech-based sideline screening for mild traumatic brain injury (mTBI), commonly known as concussion, in sports-field environments is often contaminated by intense background noise, such as crowd cheering, whistles, and stadium announcements. This noise can compromise the reliability of speech-based screening. In this study, we present a controlled feature-level analysis of acoustic distortion induced by real-world sports-field noise and its variation across enhancement strategies. Using clean audio-visual speech datasets collected from healthy speakers, we synthesise noisy speech by mixing several types of real sports-field noise at multiple signal-to-noise ratios (SNRs). From clean, noisy, and enhanced signals, we extract a set of features commonly used in field research and quantify noise-induced and enhancement-induced distortions at the individual-feature and feature-vector levels. The proposed analysis offers a principled methodological foundation for assessing the robustness of speech analysis in noisy clinical and field-deployment scenarios.
Conformal Left Ventricular Mapping for Visualization of Cardiac Conduction
ABSTRACT. Current visualization methods of cardiac electrical activity in the human ventricles depend on multiplanar reformats (short-axis and long-axis views) and three-dimensional volume rendering, making spatial navigation and orientation inherently complex and increasing the risk of obscuring critical anatomical and temporal patterns, such as those of re-entry. This study introduces a novel approach to flatten the geometry of the left ventricle, using disk harmonic mapping, to improve the exploration and analysis of such data by creating tailored projections of the cardiac transmembrane potential onto a common reference domain inspired by the standardized polar reference model proposed by the American Heart Association. The proposed mapping technique is evaluated using angle and area distortion metrics and proved to be quasiconformal. Disk harmonic mapping presents a valuable addition to the visualization of the cardiac electrical activity patterns of the left ventricle, and it enables comparisons of datasets derived across different studies.
Interpretable Convolutional Neural Networks and Transfer Learning for the Chronological Age Assessment of Drosophila Intestinal Tissue using Confocal Microscopy Images
ABSTRACT. Aging leads to physiological decline and increased disease susceptibility, including cancer. The fruit fly Drosophila melanogaster serves as a powerful model to assess aging due to its short lifespan and conserved molecular pathways. The objective of this study was to implement interpretable convolutional neural networks (CNNs) and use transfer learning to assess the chronological age of Drosophila intestinal tissue from confocal microscopy images. A dataset of 451 flies was acquired; images were preprocessed, and models were trained and evaluated using (a) a custom CNN architecture trained from scratch, and (b) three well-established CNN architectures, ResNet50V2, MobileNetV2, and InceptionV3, pre-trained on the ImageNet dataset. Single- (Y) and multi-channel (RGB) images were used as input to these models. Grad-CAM++ heatmaps were generated to interpret the decision-making process of the best-performing models. The results showed that CNNs can distinguish young (≤10 days) from old (≥40 days) intestinal tissues, with MobileNetV2 and InceptionV3 achieving the highest evaluation accuracy (0.86) on multi-channel images. The interpretability findings suggested that InceptionV3 may provide greater reliability for accurate localization in biologically relevant regions. This study underscores the value of integrating advanced computational analysis into biological research.
Dispersion Belief Permutation Entropy: A Novel Uncertainty-Aware Symbolic Approach for Complexity Analysis of Motor Unit Interspike Intervals and EMG Signals
ABSTRACT. Motor Units (MUs) compose the muscular structure and are the functional units of the motor command. For a MU, spikes correspond to motor neuron discharges that generate MU action potentials (MUAPs) captured by electromyography (EMG). Identifying MU activity from EMG is challenging because it requires both accurately assigning detected spikes to their corresponding MU and determining the number of active MUs. This process produces sequences of interspike intervals (ISIs), the time between two successive spikes of the same MU, which are worth studying with dedicated complexity measures.
This paper proposes an entropy-based complexity measure, termed Dispersion Belief Permutation Entropy (DBPE), designed to capture subtle temporal patterns and uncertainty in ISI sequences and EMG. DBPE integrates dispersion-based symbolic dynamics with belief function theory to enhance sensitivity to hidden structural variations and robustness to noise. We investigate the sensitivity of DBPE to variations in MU recruitment and contraction-related changes in discharge patterns on a dataset comprising 73 MUs.
The results demonstrate that DBPE provides superior sensitivity to changes in discharge dynamics with respect to the already established Dispersion Entropy (DE) and Belief Permutation Entropy (BPE). The proposed DBPE also offers greater sensitivity in discriminating between an increasing number of MUs, thus providing a useful tool for estimating the number of active MUs when fewer than five MUs are involved. These findings highlight DBPE as a powerful tool for quantitative characterization of neural discharge dynamics.
Game-Based Upper Limb Rehabilitation: A Motivational Approach To Stroke Recovery
ABSTRACT. Stroke remains a major contributor to long-term disability and has prompted growing interest in serious games as engaging tools to support rehabilitation and, consequently, improve patient outcomes. Hence, this work presents the design and development of a therapist-driven rehabilitation platform that integrates motion tracking, games with progressive difficulty levels, a clinically inspired assessment test, and accessible interfaces. Combining principles of rehabilitation science and serious game design, the system enables therapists to monitor patient performance in real time. Built in Unity3D and interfaced with the Microsoft Kinect V2 sensor, the platform employs visual and auditory feedback to reinforce correct execution and sustain motivation. A clinically relevant assessment test, inspired by the Fugl-Meyer Assessment (FMA), was developed to evaluate the motor function of post-stroke patients using clinically meaningful tasks rather than kinematic metrics extracted during gameplay, providing a reliable method for tracking patient progression over time. Three interactive prototypes (Window Cleaning, Object Catch, and Capsule) were developed to target upper limb mobility, coordination, and reaction time. To evaluate game quality, 14 participants performed the FMA-inspired assessment test, played all three prototypes, and completed a questionnaire about their gameplay experience. The results showed that although the participants felt that the assessment test was an interesting tool to quantitatively track a person's motor evolution, the inaccuracies of the pose estimation method led to unreliable and inconsistent metric results. Regarding the games, Capsule was the most preferred; however, in general, the participants felt very motivated while playing all the games.
Nevertheless, there is still room for improvement, particularly in enhancing the pose tracking accuracy and implementing adaptive game progression in which level difficulty dynamically adjusts according to the player’s real-time performance.
Gaze-Derived Physiological Metrics Across Video Game Genres: A Controlled Experimental Study
ABSTRACT. Context and Motivation: Videogame addiction (VA), or Gaming Disorder, is recognized by the WHO in the ICD-11 and the APA in the DSM-5 as a significant mental health condition. Despite its clinical recognition, current diagnosis relies heavily on subjective self-report scales and questionnaires, such as the Game Addiction Scale (GAS), which necessitates direct intervention from health professionals. There is a critical need for innovative digital tools that provide objective scientific evidence to complement traditional psychological assessments. Purpose and Hypothesis: The ADICVIDEO project investigates the relationship between VA and objective patterns of emotional state, sleep quality, and fatigue. The central hypothesis posits that the degree of addiction can be quantified using real-time physiological and physical activity sensors, integrated with Machine Learning (ML) classifiers to identify risk patterns and biomarkers.
Methodology: The research targets emerging adults (aged 18–30) within the university community. The study utilizes a multimodal approach across six sensorized stations equipped with Empatica E4 (physiological signals), Tobii Pro Glasses 3 (eye tracking), and Logitech Brio 4K cameras (facial recognition/FaceReader). Two core research protocols have been successfully implemented: the Transversal Study (Protocol PEIBA 1868-N-23), which focuses on initial usage habits and psychological well-being (n=440), and the Experimental Sessions (Protocol SICEIA 2024-1858), which involve synchronized biometric recording during gaming and sleep. Current Status and Progress: To date, the transversal phase is 100% complete, while experimental data collection stands at 75% (n=101). Key outcomes include the publication of player profiles in Computers & Education Open (2025) and the release of the PaGER-Sync ADICVIDEO dataset in Data in Brief (2025). Preliminary ML models (SVM, Random Forest) for emotion and fatigue classification are currently under development. Furthermore, clinical validation through a Delphi Expert Panel with psychiatrists and psychologists has been conducted to validate clinical dimensions of the ADICVIDEO model (Protocol SICEIA 2025-3504). Expected Impact and Future Horizon: The project aims to deliver diagnostic support tools for practitioners and early-warning systems for users. Future efforts are oriented toward P4 Medicine (Predictive, Preventive, Personalized, Participatory), utilizing explainable AI and biofeedback interfaces to enhance digital well-being and emotional self-management.
Building eHealth Professional Profiles for Interoperable Digital Health Services in the EHDS Era: Evidence from Multi-Sector Interoperability Projects
ABSTRACT. The digital transformation of healthcare systems in Europe requires professionals who can design, manage, and evaluate interoperable health information infrastructures. This paper analyzes how healthcare professionals enrolled in the European master’s program ManagiDiTH (Managing the Digital Transformation in Healthcare) perceive interoperability implementation in their countries. Drawing on 72 student case studies from the course “Technologies in Interoperable Ecosystems,” we conduct a qualitative thematic analysis of interoperability experiences across Greece, Finland, Portugal, and cross-border settings. Using the General Conceptual Framework of eHealth Profiles and Competences as the analytical lens, we map student-identified themes to the framework’s three profile domains (Health, Non-Health, ICT), the six phases of the eHealth service lifecycle (Plan, Build, Run, Enable, Manage, Use), and the competency decomposition into knowledge, skills, and attitudes. The findings reveal that students consistently identify competency needs spanning all three domains and all lifecycle phases, with particular emphasis on the Enable and Manage phases. Challenges reported by students, including legacy system integration, digital skills gaps, and data privacy concerns, map directly to specific competency areas defined in the framework.
Scaling PACS Beyond Local Storage: A Unified Architecture for DICOM Metadata and Pixel Persistence in Distributed Environments
ABSTRACT. Traditional centralized PACS architectures exhibit significant limitations under WAN deployments, particularly regarding latency and scalability. Although NoSQL databases have been explored to improve horizontal scalability, prior approaches typically separate metadata indexing from binary object storage, relying on distinct technologies and persistence models. Chunking mechanisms have also been used to handle large medical imaging objects, but mostly as technical workarounds rather than architectural design parameters. In contrast, this work proposes a unified, chunk-aware distributed persistence architecture where both DICOM metadata and image objects are managed within the same NoSQL infrastructure, and assesses its retrieval performance under realistic WAN conditions.
An AI-driven document builder for standardized clinical studies
ABSTRACT. Standard Operating Procedures are essential instruments for ensuring consistency, regulatory compliance, and data quality in clinical research. Despite their recognized importance, SOP authoring remains a largely manual, unstructured process with limited tooling for systematic, multi-centre creation.
This paper proposes WizarDoc, a modular, AI-assisted questionnaire-driven web application designed to streamline the production of SOP-aligned clinical study documentation. WizarDoc combines reusable template pools (governed question sets plus DOCX layouts) with a step-by-step wizard that dynamically generates contextual questions and AI-driven answer suggestions via a Retrieval-Augmented Generation (RAG) pipeline grounded in authoritative clinical standards. Captured identifier-keyed responses are exported as populated DOCX documents. The tool is publicly available at https://github.com/ieeta-mith/wizardoc.
An Ensemble-Based Approach to Validating AI-Generated Clinical Discharge Summaries
ABSTRACT. The automated generation of medical discharge summaries using large language models (LLMs) promises efficiency gains for clinical documentation. However, factual inaccuracies, omissions of critical information, and confabulated content pose significant risks to patient safety and clinical uptake. Therefore, robust validation strategies are required before such systems can be safely deployed in routine care. Yet many traditional evaluation metrics either rely on human-generated reference summaries or do not align well with human preferences. This paper aims to close this gap and presents a multi-layered validation framework for AI-generated discharge summaries. We combine concept-based, metric-based, classifier-based, and LLM-based evaluation methods within an ensemble learning approach. The framework considers multilingual clinical environments and supports both real-time validation of individual summaries and longitudinal quality assurance for prompt and model updates. We report results from a proof-of-concept implementation on 21 real-world discharge summaries including human corrections. Our findings highlight the limited reliability of single automated metrics and demonstrate the value of ensemble-based validation aligned with clinician judgment.
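The ensemble idea described above can be sketched as a weighted combination of per-validator scores. The validator families below follow the abstract's taxonomy, but the specific weights, score values, and acceptance threshold are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

def ensemble_validate(scores, weights, threshold=0.5):
    """Combine normalized validator scores (each in [0, 1], higher = more
    reliable summary) into one score plus an accept/flag decision."""
    combined = float(np.average(scores, weights=weights))
    return combined, combined >= threshold

# Hypothetical per-summary scores from the four validator families.
scores = np.array([0.9,   # concept-based: overlap with source-record concepts
                   0.6,   # metric-based: reference-free quality metric
                   0.8,   # classifier-based: trained error detector
                   0.7])  # LLM-based: judge-model rating
combined, accept = ensemble_validate(scores, weights=[1, 1, 2, 2])
```

Giving the learned validators (classifier, LLM judge) twice the weight of the surface metrics is one plausible way to encode the abstract's finding that single automated metrics alone are unreliable.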
Architecture of a New Domestic Ankle-Brachial Index System Using Pulse Wave Velocity
ABSTRACT. The ankle-brachial index (ABI) is a crucial non-invasive parameter in the diagnosis of peripheral arterial disease (PAD). The classic techniques used to obtain this index have several limitations, such as the need to use Doppler equipment, patient discomfort, and the time required to perform the test. In recent years, several studies have been conducted with the aim of mitigating these restrictions, including obtaining this index based on pulse wave velocity with biomedical signals such as the combination of ECG and PPG. These systems have been developed for clinical environments and use wired connections. This paper proposes an improved version of this previous methodology, incorporating real-time wireless data transmission to a mobile device and subsequent storage in a remote database, from which ABI values are estimated, with the aim of increasing the portability of the system and improving its applicability in domestic environments. The importance of home monitoring is justified by continuous tracking and data collection for the future development of predictive models. The new system was evaluated through a comparative study between the traditional method, the previous wired approach, and the new wireless solution in a clinical setting with 25 diabetic patients susceptible to PAD complications. The results show good accuracy, confirming its usefulness as a home-based alternative for measuring ABI, although further optimization is required.
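As a rough illustration of the ECG+PPG pulse-wave-velocity principle this abstract builds on, the sketch below estimates pulse transit time from detected fiducial points and converts it to a velocity. The function names, sampling rate, and toy fiducial indices are hypothetical; real systems would first run R-peak and pulse-foot detectors on the raw signals:

```python
import numpy as np

def pulse_transit_time(ecg_r_peaks, ppg_feet, fs):
    """Mean delay (seconds) from each ECG R-peak to the next PPG pulse foot.

    ecg_r_peaks, ppg_feet: sample indices of detected fiducial points.
    fs: sampling rate in Hz, shared by both synchronized channels.
    """
    delays = []
    for r in ecg_r_peaks:
        later = [f for f in ppg_feet if f > r]
        if later:
            delays.append((later[0] - r) / fs)
    return float(np.mean(delays)) if delays else float("nan")

def pulse_wave_velocity(path_length_m, ptt_s):
    """PWV = arterial path length travelled by the pulse wave / transit time."""
    return path_length_m / ptt_s

# Toy fiducials at 500 Hz: each PPG foot lags its R-peak by 100 samples (0.2 s).
fs = 500
r_peaks = [100, 600, 1100]
feet = [200, 700, 1200]
ptt = pulse_transit_time(r_peaks, feet, fs)
pwv = pulse_wave_velocity(1.0, ptt)  # over an assumed 1 m heart-to-ankle path
```

In an ABI context, PWV (or PTT) measured at arm and ankle sites would then feed a regression that estimates the pressure ratio, rather than being used directly.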
Detecting Parkinson’s Disease with Postural Sway via Single IMU and Machine Learning
ABSTRACT. Postural instability is a major contributor to fall risk in Parkinson’s disease (PD), yet early balance impairments are often not detected by standard clinical assessments. This study examines whether simple standing balance tasks, assessed using a single wearable inertial measurement unit (IMU), can effectively differentiate individuals with PD from age-matched healthy controls (HC). Trunk acceleration data were collected during four standing conditions: quiet standing on firm ground, standing on foam, and both conditions combined with a cognitive dual task. A broad set of time-domain, frequency-domain, and nonlinear postural sway features were extracted. Statistical analysis revealed significant group differences primarily during quiet standing on firm ground, where participants with PD showed increased sway path length, higher sway velocity, and greater jerk, indicating reduced smoothness of postural control. Machine learning classification using k-nearest neighbours (56% accuracy), support vector machines (62% accuracy), and decision trees (74% accuracy) demonstrated moderate baseline performance with the full feature set. After applying Joint Mutual Information feature selection, classification performance improved, with the decision tree achieving 82% accuracy. These results indicate that a brief, low-burden standing test using a single IMU, combined with targeted feature selection and machine learning, can sensitively detect PD-related postural instability and may serve as a practical screening or monitoring tool.
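The time-domain sway descriptors named in the abstract (path length, velocity, jerk) can be sketched directly from a trunk acceleration trace. This is a minimal illustration, not the study's actual feature pipeline; axis conventions, units, and the circular toy signal are assumptions:

```python
import numpy as np

def sway_features(acc, fs):
    """Time-domain sway descriptors from a 2-axis (AP/ML) trunk acceleration trace.

    acc: array of shape (n_samples, 2); fs: sampling rate in Hz.
    """
    steps = np.diff(acc, axis=0)               # sample-to-sample displacement in signal space
    path_length = float(np.sum(np.linalg.norm(steps, axis=1)))
    duration = (len(acc) - 1) / fs
    mean_velocity = path_length / duration     # path traversed per second
    jerk = np.diff(acc, axis=0) * fs           # first derivative of acceleration
    mean_sq_jerk = float(np.mean(np.sum(jerk ** 2, axis=1)))
    return {"path_length": path_length,
            "mean_velocity": mean_velocity,
            "mean_sq_jerk": mean_sq_jerk}

# Circular sway traces: a larger sway radius yields a longer path and higher velocity,
# mimicking the PD-vs-HC differences reported in the abstract.
t = np.linspace(0, 2 * np.pi, 1000, endpoint=False)
small = np.c_[0.1 * np.cos(t), 0.1 * np.sin(t)]
large = np.c_[0.3 * np.cos(t), 0.3 * np.sin(t)]
f_small = sway_features(small, fs=100)
f_large = sway_features(large, fs=100)
```

Frequency-domain and nonlinear features (e.g., sample entropy) would be computed on the same trace before feature selection.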
Frozen External Validation of an Interpretable k-TSP Panel for Colorectal Adenoma Across Independent GEO Cohorts
ABSTRACT. Biomarker studies in small-sample transcriptomic settings often rely on internal resampling stability as a proxy for robustness, but this does not always predict performance on truly independent data. In this study, we examined that gap using a leakage-controlled, frozen k-Top Scoring Pairs (k-TSP) framework for colorectal adenoma versus normal classification across public Gene Expression Omnibus cohorts. Model development was restricted to a single training cohort, where features were defined, candidate gene pairs were ranked, and cross-validation recurrence was measured. We then evaluated both a compact 3-pair panel and a broader 15-pair ensemble on three external cohorts without retraining. Both models generalized strongly across independent datasets, with the ensemble showing the most consistent overall performance. Notably, one pair with only moderate recurrence during cross-validation still demonstrated meaningful external utility, indicating that internal recurrence alone may not fully capture downstream value. Orthogonal methylation analysis provided additional biological support for all six genes in the compact panel, with especially strong evidence for TCN1 and GUCA2B. Permutation testing further supported the robustness of these findings. Overall, our results suggest that compact, interpretable rank-based signatures can generalize well under a strict frozen-evaluation design, and that cross-validation recurrence should be treated as an informative but incomplete indicator of external usefulness.
Cross-Cancer Computational Framework for Immune-Dysregulated Ecosystem Discovery and Therapeutic Prioritization
ABSTRACT. Understanding why tumors evade immune attack remains a central challenge in cancer research. We present a cross-cancer computational framework for discovering immune-dysregulated “traitor” cell states and linking them to actionable therapeutic targets. The framework integrates batch-aware representation learning with unsupervised manifold refinement to stabilize neighborhood geometry and identify ecosystem-specific resistance programs across single-cell cohorts from lung, colon, pancreatic, and melanoma tumors. Across cancers, refinement improved embedding compactness and preserved biologically coherent structure relative to baseline embeddings. Distinct resistance ecosystems emerged, including inflammatory myeloid programs in lung and colon, desmoplastic fibroblast hubs in pancreas, and cytotoxic interferon-enriched T-cell states in melanoma. Ecosystem-informed therapeutic prioritization highlighted actionable intervention points, including CXCR4-mediated stromal exclusion, CSF1R-driven macrophage survival, and IL6-associated fibro-inflammatory signaling. Expression-matched permutation testing and ligand–receptor interaction analysis supported the biological specificity of the discovered programs. These results demonstrate a generalizable, data-driven strategy for mapping dysfunctional immune ecosystems and accelerating cross-cancer therapeutic design.
Feature Fusion and Disfluency for Automated Anxiety Detection in Clinical Interviews
ABSTRACT. Traditional speech-based anxiety detection has focused on acoustic prosody, such as pitch, energy, and voice quality. However, psycholinguistic research indicates that anxiety more directly affects cognitive processes involved in speech planning and lexical retrieval. To examine this discrepancy, we compared linguistic disfluency features, such as filled pauses and silence patterns, with acoustic prosodic features. The method can support practitioners in their clinical interviews as a decision-support tool for more objective measures of anxiety. Using 163 participants from the DAIC-WOZ dataset, we trained Random Forest classifiers under stratified 5-fold cross-validation. We evaluated the model’s performance across three settings: acoustic-only (n=5), disfluency-only (n=9), and combined fusion (n=14). The disfluency-only model significantly outperformed the acoustic baseline (F1=0.78 versus 0.67, p=0.008, Cohen’s d=2.18). Filled pause rate emerged as the strongest predictor (28.7% importance), with anxious participants exhibiting a 74% higher rate. This research provides evidence that linguistic disfluency features are better speech biomarkers for detecting anxiety.
Deep Learning for Automated Quality Assessment of NK Cell Differentiation in iPSC Cultures
ABSTRACT. Induced pluripotent stem cell (iPSC)-derived natural killer (NK) cells offer a promising, scalable platform for next-generation immunotherapy manufacturing. However, variability in differentiation efficiency across biological batches makes early quality assessment challenging. Because NK maturation requires several weeks, failures detected at late stages can lead to substantial losses of time, cost, and experimental resources. To address this, we propose a deep learning (DL) framework for automated, early-stage, and non-destructive assessment of NK maturation status using bright-field microscopy images of Day-10 cultures. Our approach leverages a domain-specific preprocessing pipeline and modern DL architectures to capture subtle morphological indicators of hematopoietic commitment. The best-performing model (ResNet18) achieved an accuracy of 96.25%, with t-SNE and Grad-CAM analyses confirming that the network learns biologically relevant structural patterns. By providing early predictive insights through a label-free workflow, this framework offers a scalable strategy to standardize quality control in iPSC-NK cell manufacturing.
Optimized metagenomic classification through hybrid computational approaches
ABSTRACT. Metagenomics has transformed microbial community analysis, yet challenges persist in computational efficiency, tool accessibility, and benchmarking standards. We present a metagenomic analysis pipeline that performs taxonomic identification in two phases. First, it optionally processes raw sequencing data through a preprocessing cascade including quality control, host DNA depletion, metagenomic assembly and binning. For the core analysis, it combines k-mer screening with an alignment system that switches between high-precision and fast tools depending on the genomic characteristics of the sample, followed by taxonomic assignment using exact matches and weighted lowest common ancestor strategies. Results show F1 scores above 0.80 across all biological domains and taxonomic levels, maintaining robust performance up to 30% sequence divergence. Validation on 30 Human Microbiome Project samples confirms accurate profiling of complex microbial communities (80.3% sensitivity, 20.5% MAPE) while detecting previously unreported taxa. The pipeline processes most datasets in under 60 minutes with a memory footprint of 2.82 GB.
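The weighted lowest-common-ancestor step mentioned above is a standard trick in taxonomic assignment and is easy to sketch: each read's hit weights are propagated up the taxonomy, and the read is assigned to the deepest clade holding enough of the total weight. The toy taxonomy, threshold, and names below are illustrative, not the pipeline's parameters:

```python
def weighted_lca(parent, hits, threshold=0.5):
    """Assign a read to the deepest taxon whose clade accumulates at least
    `threshold` of the total hit weight.

    parent: child -> parent taxon map (the root maps to itself).
    hits: taxon -> alignment weight for this read.
    """
    total = sum(hits.values())
    clade = {}
    for taxon, w in hits.items():       # propagate each hit's weight up to the root
        node = taxon
        while True:
            clade[node] = clade.get(node, 0.0) + w
            if parent[node] == node:
                break
            node = parent[node]

    def depth(n):                        # distance from the root
        d = 0
        while parent[n] != n:
            n, d = parent[n], d + 1
        return d

    eligible = [t for t, w in clade.items() if w / total >= threshold]
    return max(eligible, key=depth)      # most specific taxon meeting the threshold

# Toy taxonomy: root -> Bacteria -> {E.coli, Salmonella}
parent = {"root": "root", "Bacteria": "root",
          "E.coli": "Bacteria", "Salmonella": "Bacteria"}
call = weighted_lca(parent, {"E.coli": 3.0, "Salmonella": 1.0}, threshold=0.7)
```

Raising the threshold trades specificity for confidence: ambiguous reads fall back to higher ranks instead of being forced to a species-level call.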
Toward Auditable Neuro-Symbolic Reasoning in Pathology: SQL as an Explicit Trace of Evidence
ABSTRACT. Automated pathology image analysis is central to clinical diagnosis, but clinicians still ask which slide features drive a model’s decision and why. Vision–language models can produce natural language explanations, but these are often correlational and lack verifiable evidence. In this paper, we introduce an SQL-centered agentic framework that enables both feature measurement and reasoning to be auditable. Specifically, after extracting human-interpretable cellular features, Feature Reasoning Agents compose and execute SQL queries over feature tables to aggregate visual evidence into quantitative findings. A Knowledge Comparison Agent then evaluates these findings against established pathological knowledge, mirroring how pathologists justify diagnoses from measurable observations. Extensive experiments on two pathology visual question answering datasets demonstrate that our method improves interpretability and decision traceability while producing executable SQL traces that link cellular measurements to diagnostic conclusions.
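The SQL-as-evidence-trace idea can be sketched with an in-memory feature table: the query itself is the auditable artifact linking cell-level measurements to a finding. The table schema, column names, and query below are hypothetical, not the framework's actual feature tables:

```python
import sqlite3

# Hypothetical cell-level feature table; columns are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE cells (
    slide_id TEXT, cell_type TEXT, nuclear_area REAL, mitotic INTEGER)""")
conn.executemany(
    "INSERT INTO cells VALUES (?, ?, ?, ?)",
    [("s1", "epithelial", 42.0, 0),
     ("s1", "epithelial", 55.0, 1),
     ("s1", "lymphocyte", 18.0, 0),
     ("s1", "epithelial", 61.0, 1)])

# A query a Feature Reasoning Agent might emit: quantify evidence for
# "enlarged nuclei with elevated mitotic activity"; the SQL string itself
# is kept as the audit trace behind the resulting finding.
query = """
    SELECT AVG(nuclear_area) AS mean_area,
           SUM(mitotic) * 1.0 / COUNT(*) AS mitotic_fraction
    FROM cells
    WHERE slide_id = 's1' AND cell_type = 'epithelial'
"""
mean_area, mitotic_fraction = conn.execute(query).fetchone()
```

A knowledge-comparison step would then test these numbers against reference thresholds from pathological knowledge, with both the query and its result retained for review.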
Enhancing Brazilian Portuguese Small Visual Language Models Through Progressive Learning for Thorax X-Ray Analysis
ABSTRACT. Vision-Language Models (VLMs) have recently demonstrated strong multimodal reasoning capabilities in medical imaging. However, most state-of-the-art systems are large-scale, proprietary, and predominantly trained on English corpora, limiting their applicability in low-resource clinical language settings such as Brazilian Portuguese. This work introduces three lightweight Vision-Language Models in Brazilian Portuguese for chest X-ray analysis, namely BodeVLM-RX, InternBodeVL-RX, and BodeMedRX, developed under a progressive learning paradigm and fine-tuned using the proposed Bode-CXR-VQA dataset, a translated and curated Portuguese adaptation of a large-scale radiology visual question-answering benchmark. The models are built upon distinct architectural foundations, including general-purpose and domain-specialized medical VLMs, enabling a controlled comparison of adaptation efficiency across different pretraining distributions. Experimental results evaluate the impact of scaling domain-specific supervision from 1k to 50k samples on closed-ended and open-ended clinical VQA tasks using Exact Match, BLEU, and ROUGE-L metrics, demonstrating consistent performance gains across all architectures, with markedly different scaling behaviors.
Active-CLIP: Zero-Shot Active Learning with Visual Pseudo-Label Propagation for Efficient Med-VQA
ABSTRACT. Background and Objective: Medical Visual Question Answering (Med-VQA) demands accurate multimodal interpretation of images and clinical questions, yet is hindered by data scarcity and high annotation costs. Vision-Language Models (VLMs) like CLIP offer strong zero-shot capabilities, but suffer from poor calibration and hallucinations in medical domains. This study introduces Active-CLIP, a hybrid framework combining zero-shot active learning with semi-supervised pseudo-label propagation to maximize annotation efficiency and performance in low-data regimes without fine-tuning the foundation model. Methods: Using frozen CLIP (ViT-B/32) as a zero-shot feature extractor and uncertainty estimator, Active-CLIP selects informative samples via Shannon entropy and propagates high-confidence pseudo-labels to the remaining pool using visual similarity in CLIP embedding space. A multimodal transformer is trained from scratch on the enriched set and evaluated on four public Med-VQA datasets (VQA-RAD, Path-VQA, Omni-Med-VQA-Mini, SLAKE) across labeling budgets of 10%–90%, compared against a real-labels-only baseline. Results: Active-CLIP consistently outperforms the baseline, with the largest gains at low ratios (10%–30%): up to +13% in BLEU-4 (Path-VQA) and +41% in ROUGE-L (SLAKE), alongside improved semantic alignment (higher AS, lower AE). Conclusions: Active-CLIP offers a promising, annotation-efficient solution for Med-VQA in data-scarce settings by leveraging zero-shot exploration and reliable pseudo-label exploitation.
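The two mechanisms at the core of Active-CLIP, entropy-based sample selection and similarity-gated pseudo-labeling, can be sketched compactly. Function names, the similarity threshold, and the toy inputs are illustrative assumptions; the embeddings would come from the frozen CLIP encoder:

```python
import numpy as np

def shannon_entropy(probs):
    """Per-sample entropy of zero-shot class probabilities (rows sum to 1)."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def select_uncertain(probs, budget):
    """Indices of the `budget` most uncertain samples, sent for annotation."""
    return np.argsort(-shannon_entropy(probs))[:budget]

def propagate_pseudo_labels(embeddings, labeled_idx, labels, threshold=0.9):
    """Give each unlabeled sample its nearest labeled neighbour's label when
    cosine similarity in the (CLIP-like) embedding space clears `threshold`."""
    E = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = E @ E[labeled_idx].T
    pseudo = {}
    for i in range(len(E)):
        if i in labeled_idx:
            continue
        j = int(np.argmax(sims[i]))
        if sims[i, j] >= threshold:
            pseudo[i] = labels[j]
    return pseudo

probs = np.array([[0.98, 0.02],    # confident zero-shot answer -> low entropy
                  [0.55, 0.45],    # near-uniform -> high entropy, worth labeling
                  [0.90, 0.10]])
picked = select_uncertain(probs, budget=1)
```

The labeled-plus-pseudo-labeled set then trains the downstream multimodal transformer, leaving the foundation model untouched.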
Cross-Modality Distillation for Multiple Sclerosis Lesion Segmentation in MRI
ABSTRACT. Accurate segmentation of multiple sclerosis (MS) lesions from magnetic resonance imaging (MRI) is essential for assessing disease activity and progression. Among MRI sequences, FLAIR provides high lesion contrast, while T1-weighted images, though routinely acquired, encode lesion-related information in a less explicit manner. The difference in how lesion information is represented across these modalities makes MS lesion segmentation a well-established multimodal task to study cross-modal knowledge transfer. In this work, we investigate whether knowledge distillation inspired by the Learning Using Privileged Information (LUPI) paradigm can effectively transfer lesion-related information from a more informative modality (FLAIR) to a less informative one (T1-weighted). Our objective is to systematically assess the effectiveness of distillation in a controlled and clinically meaningful setting. We propose a teacher–student framework in which a teacher network is trained to segment lesions from FLAIR and T1-weighted images and guides a student trained exclusively on T1-weighted data. This setup enables the student to internalize lesion-specific representations that are not directly apparent in T1 contrast alone, providing a principled framework to evaluate cross-modal knowledge transfer in MS imaging. The proposed method is evaluated on two datasets, a public benchmark (ISBI 2015) and a clinical dataset from San Martino Hospital (Genoa, Italy), serving as a controlled experimental framework to quantify the benefits and limitations of LUPI-based distillation strategies in MS imaging.
Align-cDAE: Alzheimer’s Disease Progression Modeling with Attention-Aligned Conditional Diffusion Auto-Encoder
ABSTRACT. Generative AI framework-based modeling and prediction of longitudinal human brain images offer an efficient mechanism to track neurodegenerative progression essential for the assessment of diseases like Alzheimer’s. Among the existing generative approaches, recent diffusion-based models have emerged as an effective alternative to generate disease progression images. Incorporating multi-modal and non-imaging attributes as conditional information into diffusion frameworks has been shown to improve controllability during such generations. However, existing methods do not explicitly ensure that information from non-imaging conditioning modalities is meaningfully aligned with image features to introduce desirable changes in the generated images, such as modulation of progression-specific regions. Further, more precise control over the generation process can be achieved by introducing progression-relevant structure into the internal representations of the model, which is lacking in existing approaches. To address these limitations, we propose a diffusion auto-encoder-based framework for disease progression modeling that explicitly enforces alignment between different modalities. The alignment is enforced by introducing an explicit objective function that enables the model to focus on the regions exhibiting progression-related changes. Further, we devise a mechanism to better structure the latent representational space of the diffusion auto-encoding framework. Specifically, we assign separate latent subspaces for integrating progression-related conditions and retaining subject-specific identity information, allowing better-controlled image generation. We experimentally validated the performance of our model by evaluating it on Alzheimer’s disease progression generation through various image similarity metrics and region-wise volumetric assessments. These results demonstrate that enforcing alignment and better structuring of the latent representational space of the diffusion auto-encoding framework lead to more anatomically precise modeling of Alzheimer’s disease progression.
Early Validation of a Force-Aware LfD Framework for Robotic Surgery
ABSTRACT. Learning from Demonstration (LfD) facilitates the transfer of human skills to robots through the analysis of motion-based demonstrations, but most approaches rely only on kinematics, limiting their use in surgical applications where interaction forces are critical. This work proposes a multimodal LfD framework that integrates position, velocity, orientation, force and torque data into a Hidden Markov Model (HMM). Thirty-three demonstrations of a puncture task on a deformable surface were collected from 11 participants and used to train the model. The UR3e robot reproduced the learned trajectory with sub-millimeter accuracy (RMSE = 0.1127 mm) and replicated key force patterns observed in human demonstrations. Results demonstrate the feasibility of incorporating force information into LfD, enhancing trajectory learning and contributing to the development of intelligent robotic systems for intraoperative assistance.
GD-HMoE: Foundation Model-Guided Mixture of Experts for Multi-Task Medical Image Analysis
ABSTRACT. In endoscopic image analysis, conventional multi-task learning frameworks based on hard parameter sharing struggle to simultaneously satisfy the divergent requirements of lesion localization and qualitative characterization for feature representation. To address this, we propose GD-HMoE (Geometric-Decoupled Hierarchical Mixture of Experts), a framework driven by general structural prior distillation for joint lesion detection and classification. The proposed method constructs two complementary sub-modules—semantic and geometric experts—on a unified backbone, achieving feature decoupling through a dynamic routing mechanism. Furthermore, a task-oriented hierarchical feature aggregation strategy is designed, where structural features are primarily utilized for the localization task, while fused features support classification decisions. Experimental results on multi-center Inflammatory Bowel Disease (IBD) and iodine-stained Esophageal Cancer (EC) datasets demonstrate that GD-HMoE achieves superior performance in both classification and detection. The framework effectively mitigates negative transfer in multi-task learning and offers a novel structured modeling approach for medical image analysis.
Knowledge Distillation with Hybrid Soft-Label Losses for Imbalanced Lung Nodule Classification
ABSTRACT. Knowledge Distillation (KD) is widely used to compress deep neural networks while maintaining high performance. Loss functions help in choosing the best strategy for image classification. This paper investigates hybrid distillation losses applied in KD for lung nodule malignancy classification on the imbalanced LIDC-IDRI dataset. We propose three hybrid distillation loss functions that combine MSE, Focal Loss, and MSE+Focal Loss for comparison with the baseline loss, using softened teacher outputs to mitigate the class imbalance inherent in medical scenarios. Five teacher models (ResNet-50, DenseNet-121, ViT-Small, Swin-Transformer, MaxViT-Tiny) and two lightweight students (EfficientNet-B0, MobileNetV2) were evaluated across three temperatures (τ) in 120 experiments, reducing the students’ parameters by 42–97% relative to their respective teachers. MSE-augmented distillation consistently outperformed the traditional distillation loss function. On the test set, MobileNetV2 achieved the highest gain (+4% AUC) with traditional loss and MSE at τ = 10, while EfficientNet-B0 improved by +1.3% with traditional loss and MSE at τ = 10. These findings indicate that symmetric penalization in probability space improves robustness under class imbalance. The source code is available at https://github.com/graciellafavoreto/hybrid-kd-losses.
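One hybrid variant of the kind the abstract compares, MSE between temperature-softened teacher and student distributions plus a focal term on the hard labels, can be sketched as follows. The mixing coefficients, temperature, and toy logits are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def softmax(z, tau=1.0):
    """Temperature-softened softmax over logits (rows = samples)."""
    z = z / tau
    z = z - z.max(axis=1, keepdims=True)      # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def focal_loss(student_logits, y, gamma=2.0):
    """Mean focal loss on hard labels: down-weights easy, well-classified samples."""
    p = softmax(student_logits)
    pt = p[np.arange(len(y)), y]
    return float(np.mean(-((1 - pt) ** gamma) * np.log(pt + 1e-12)))

def hybrid_kd_loss(student_logits, teacher_logits, y, tau=10.0,
                   alpha=0.5, beta=0.5):
    """MSE between softened teacher/student distributions plus focal loss.
    The MSE term penalizes over- and under-shooting symmetrically in
    probability space, which is the property the abstract credits for
    robustness under class imbalance."""
    ps = softmax(student_logits, tau)
    pt = softmax(teacher_logits, tau)
    mse = float(np.mean((ps - pt) ** 2))
    return alpha * mse + beta * focal_loss(student_logits, y)

teacher = np.array([[4.0, -2.0], [-1.0, 3.0]])
aligned = hybrid_kd_loss(teacher.copy(), teacher, y=np.array([0, 1]))
shifted = hybrid_kd_loss(-teacher, teacher, y=np.array([0, 1]))
```

A student matching the teacher (and the labels) incurs a much lower loss than one whose logits are inverted, which is the gradient signal distillation relies on.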
Predictive Uncertainty for Medical Image Synthesis with Continuous-Time Generative Models
ABSTRACT. Continuous-time generative models, including diffusion and flow-matching models, have shown strong performance in medical image synthesis, yet they produce outputs without any indication of how reliable their predictions are, a critical limitation for clinical applications. We propose an uncertainty-aware approach that augments these models with the ability to estimate predictive uncertainty by design, enabling them to jointly predict the generative update and its associated confidence. Uncertainty is integrated into the training objective and propagated along the sampling trajectory at inference time, yielding spatially resolved, output-aligned uncertainty maps without modifying the underlying generative formulation. We evaluate our approach on two medical image synthesis tasks, low-dose CT denoising and T1-to-T2 MRI translation, across both diffusion and flow-matching models. Our method improves generation quality while producing meaningful uncertainty estimates, offering a practical path toward more transparent and trustworthy medical image synthesis.
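A common way to realize "jointly predict the update and its confidence" is a heteroscedastic Gaussian objective: the network outputs both a mean update and a per-pixel log-variance, and per-step variances are accumulated along the trajectory. The sketch below shows that pattern under an independent-step-noise assumption; it is a generic illustration, not the paper's formulation:

```python
import numpy as np

def gaussian_nll(pred_mean, pred_logvar, target):
    """Heteroscedastic negative log-likelihood: penalizes error scaled by the
    predicted variance, plus a log-variance term that discourages the model
    from claiming infinite uncertainty everywhere."""
    return float(np.mean(0.5 * (pred_logvar
                 + (target - pred_mean) ** 2 * np.exp(-pred_logvar))))

def propagate_uncertainty(step_vars):
    """Accumulate per-step variance maps along the sampling trajectory,
    assuming independent step noise; returns a spatial std-dev map."""
    return np.sqrt(np.sum(np.stack(step_vars), axis=0))

target = np.zeros((4, 4))
good = gaussian_nll(target, np.zeros_like(target), target)       # perfect mean
bad = gaussian_nll(target + 2.0, np.zeros_like(target), target)  # biased mean
```

Under this loss, a well-calibrated model learns to raise its predicted variance exactly where its mean update is unreliable, which is what produces output-aligned uncertainty maps.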
Hybrid Semantic Augmentation for Cataract Surgery Image Synthesis
ABSTRACT. The performance of deep learning models for surgical scene understanding is strongly limited by the availability of large-scale, diverse, and well-annotated datasets. In medical imaging, data acquisition is costly and constrained by privacy regulations, motivating the use of synthetic data generation. In this work, we investigate semantic mask-driven image synthesis for cataract surgery using both conditional generative adversarial networks and diffusion-based models. We introduce two complementary semantic augmentation strategies that operate directly at the mask level, enabling the generation of anatomically consistent yet diverse surgical scenes and expanding the effective semantic training distribution. Experimental results on the Cataract-1K dataset demonstrate that the proposed augmentation strategies significantly improve image quality and diversity, allowing generative models to overcome early performance saturation. Our findings highlight the importance of procedural semantic augmentation for scalable synthetic data generation in surgical imaging.
Enhancing Synthetic CT from CBCT via Multimodal Fusion: A Study on the Impact of CBCT Quality and Alignment
ABSTRACT. Cone-Beam Computed Tomography (CBCT) is widely used for real-time intraoperative imaging due to its low radiation dose and high acquisition speed. Despite its high resolution, CBCT suffers from more artifacts and thereby lower visual quality compared to conventional Computed Tomography (CT). A recent approach to mitigate these artifacts is synthetic CT (sCT) generation, translating CBCT volumes into the CT domain. We enhance sCT generation through multimodal learning, integrating intraoperative CBCT with preoperative CT. Beyond validation on two real-world datasets, we use a versatile synthetic dataset to analyze how CBCT-CT alignment and CBCT quality affect the resulting sCT quality. The results show that multimodal sCTs consistently outperform unimodal baselines, with the greatest gains observed in well-aligned, low-quality CBCT-CT cases. We also show that these findings are reproducible in real-world clinical datasets.
An Object-Centric Preprocessing Pipeline for Retinal Prosthetic Vision
ABSTRACT. Retinal prosthetic devices like Argus II help patients to partially restore vision by mapping the visual scenes to a sparse grid of electrode phosphenes. For effective mapping on such highly sparse grids, image preprocessing becomes essential. This paper presents an object-centric hybrid preprocessing pipeline that addresses the limitations of existing methods through four contributions: (1) a single-object focus strategy combining spatial centrality, detection confidence, object area, and semantic class priority; (2) a GrabCut refinement stack with area; (3) a full perceptual simulation with CLAHE contrast enhancement and configurable electrode dropout, and (4) an open mobile and server deployment architecture to help participants see the expected outputs via virtual reality simulations. Our experiments with five scenarios and 16 participants show that the proposed pipeline leads to an increase in object recognition accuracy from 0% to 62.5%, and that the high-capacity YOLO11x object detector achieves an 80.77% accuracy without post hoc refinement. These results offer pragmatic guidance for deploying preprocessing systems on mobile hardware and establishing a reproducible framework for future vision algorithms.
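The single-object focus strategy in contribution (1) amounts to scoring each detection on four cues and keeping the winner. The weighting scheme, score formula, and toy detections below are illustrative assumptions, not the paper's calibrated parameters:

```python
import numpy as np

def select_focus_object(dets, frame_w, frame_h, class_priority,
                        weights=(0.4, 0.3, 0.2, 0.1)):
    """Score detections by spatial centrality, confidence, relative area,
    and semantic class priority; return the index of the best candidate.

    dets: list of {"box": (x, y, w, h), "conf": float, "cls": str}.
    """
    wc, wp, wa, wk = weights
    cx, cy = frame_w / 2, frame_h / 2
    best, best_score = -1, -np.inf
    for i, d in enumerate(dets):
        x, y, w, h = d["box"]
        ox, oy = x + w / 2, y + h / 2
        # 1.0 at frame center, falling to 0.0 at the corners.
        centrality = 1 - np.hypot(ox - cx, oy - cy) / np.hypot(cx, cy)
        area = (w * h) / (frame_w * frame_h)
        score = (wc * centrality + wp * d["conf"]
                 + wa * area + wk * class_priority.get(d["cls"], 0.0))
        if score > best_score:
            best, best_score = i, score
    return best

dets = [{"box": (0, 0, 50, 50), "conf": 0.9, "cls": "chair"},       # corner object
        {"box": (295, 215, 50, 50), "conf": 0.8, "cls": "person"}]  # centered person
idx = select_focus_object(dets, 640, 480, {"person": 1.0, "chair": 0.2})
```

The selected object would then be segmented (e.g., GrabCut refinement) and rendered onto the phosphene grid by the perceptual simulation stage.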