| 09:00 | FedTwin‑XAI: A Patient‑Owned Federated Digital Twin Framework with Explainable Mobile Medical Imaging and Differentially Private Synthetic Augmentation ABSTRACT. Medical imaging applications are increasingly deployed on mobile devices, yet large‑scale learning is constrained by privacy regulation, data silos, and class imbalance in real‑world collections. In parallel, patient‑centric digital twin architectures aim to enable personalized “what‑if” simulations but are rarely integrated with image‑based screening pipelines and explainability mechanisms. This paper introduces FedTwin‑XAI, a unified patient‑owned framework that integrates: (i) mobile imaging‑based screening with post‑hoc explainability, (ii) decentralized personal data pods for data sovereignty, (iii) federated learning for collaborative training without centralizing raw data, (iv) differential privacy‑aware synthetic augmentation to mitigate scarcity and imbalance, and (v) per‑patient digital twins for longitudinal monitoring and scenario simulation. The framework is positioned for privacy-by-design deployment and aligns with federated and explainable machine vision. We instantiate the imaging component using a VGG16‑based scalp disease classifier with SHAP explanations and report component-level quantitative performance on a 10‑class dataset (13,196 images), achieving 99.84% accuracy on a labeled validation set on the aggregator node. We then provide a protocol to evaluate the end‑to‑end federated twin pipeline under non‑IID client partitions, including communication–accuracy trade‑offs, calibration, and privacy accounting. The result is an actionable blueprint for privacy‑preserving, explainable medical imaging systems that can be embedded into patient‑centric digital twins for intelligent healthcare. |
| 09:10 | Explaining Visual-Language Foundation Models for Histopathology: a Patch-Level Approach ABSTRACT. Visual–language foundation models have recently become the state of the art in computational histopathology, enabling zero-shot classification and region-level interpretation via text–image similarity. However, it remains unclear whether these models rely on features that are semantically meaningful to human experts at the tile/patch level. In this work, we assess the alignment between model-derived saliency maps and specialist annotations for three visual-language foundation models: CONCH, PathGen, and MUSK. Using the model-agnostic P-IBISA method, we generate attribution maps for histopathology patches from the WSSS4LUAD and BCSS datasets and compare them to ground-truth semantic segmentation masks. Faithfulness is measured using the Confidence Increase metric, while spatial correspondence is evaluated via the DICE score. Results show that P-IBISA saliencies consistently achieve higher faithfulness than ground-truth annotations, indicating that the highlighted regions are more influential to the models’ predictions than human-labeled regions. Additionally, localization analysis reveals low overlap between saliency maps and expert annotations, suggesting that the models rely on features that do not fully correspond to human-interpretable tissue regions. These findings highlight a gap between model reasoning and human understanding, motivating future work toward integrating segmentation-aware regularization into multimodal foundation models for histopathology. |
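The spatial-correspondence measure named in the abstract above, the Dice score, can be illustrated with a minimal sketch. The function name `dice_score`, the 0.5 binarization threshold, and the convention for two empty masks are assumptions made for illustration; they are not taken from the paper.

```python
def dice_score(saliency, mask, threshold=0.5):
    """Dice overlap between a thresholded saliency map and a binary mask.

    `saliency` and `mask` are 2D lists of equal shape. The threshold is an
    illustrative assumption, not a value reported in the abstract.
    """
    pred = [v >= threshold for row in saliency for v in row]
    truth = [bool(v) for row in mask for v in row]
    intersection = sum(p and t for p, t in zip(pred, truth))
    denom = sum(pred) + sum(truth)
    # Convention (assumed): two empty regions count as perfect agreement.
    return 1.0 if denom == 0 else 2.0 * intersection / denom


saliency = [[0.9, 0.1], [0.8, 0.2]]
mask = [[1, 0], [0, 1]]
print(dice_score(saliency, mask))
```

A low value, as in this toy case, corresponds to the low overlap between saliency maps and expert annotations that the abstract reports.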
| 09:20 | Specializing Large Language Models for Hierarchy-Aware ICD-10 Mapping of Portuguese Cardiology Diagnoses PRESENTER: Gustavo Cruz ABSTRACT. Automated ICD-10 coding is a high-impact yet expertise-intensive task requiring precise hierarchical reasoning. We evaluate whether a specialized LLM can approach cardiology specialist-level performance when mapping short Portuguese diagnoses to ICD-10 codes. We introduce (i) a double-specialist benchmark of 381 diagnoses from 89 clinical texts and (ii) a 14,685-pair diagnosis supervision corpus generated by a teacher LLM with structural validation and hierarchical normalization. Across paradigms, Block+Category accuracy improves from 0.5350 (retrieval baseline) to 0.7366 (expanded retrieval), 0.7815 (open frontier model), and 0.9019 (proprietary frontier model). Most residual errors reflect hierarchical near-misses rather than semantic misclassification. Supervised fine-tuning of mid-scale open models achieves 0.8582 accuracy on a stratified test set, approaching frontier performance. Results indicate that ICD-10 coding depends more on hierarchical calibration and domain-specific supervision than on model scale alone, supporting compact, clinically deployable assistants for Portuguese cardiology. |
| 09:30 | PRESENTER: Isabel Rio-Torto ABSTRACT. Explainability is essential for the reliable deployment of deep learning models in clinical settings, with most research focusing on post-hoc visual saliency methods. However, these rely on task-specific classifiers, limiting their applicability for analysing vision encoders independently of downstream tasks. Moreover, these methods can only explain the predefined set of classes for which the underlying model was trained. In this work, we investigate how DeViL, a framework that translates visual features into natural language, can be leveraged for understanding representations learned by chest X-ray vision models. DeViL requires no task-specific heads, since it only uses the frozen vision encoder, and it is able to generate open-vocabulary saliency maps because it uses a language model. We adapt DeViL to align visual features with clinically meaningful concepts by training on structured radiology reports and using a radiology-specialised language model. We conduct experiments on structured radiology report generation, saliency generation, and open-vocabulary saliency-text grounding on different types of chest X-ray vision encoders: convolutional, self-supervised Vision Transformer, and vision–language Transformer. Results show competitive performance with large end-to-end report generation models and demonstrate that DeViL's open-vocabulary saliency maps outperform those produced by a specialised saliency generation method for vision-language encoders. |
| 09:40 | Interpretable Hybrid Modeling for Breast Cancer Risk Stratification from Structured Radiology Reports PRESENTER: Juan Salvador Toledo-Rios ABSTRACT. Breast cancer risk stratification based on radiological reports is challenging, especially in non-English-speaking clinical settings. This study proposes a two-stage hybrid framework that combines automatic information extraction using a large language model (LLM) with structured predictive modeling to estimate malignancy risk. A private set of 40,394 anonymized reports in Spanish (2014–2019) was used. The extraction module was based on an LLM adjusted using Low-Rank Adaptation (LoRA) to transform unstructured clinical text into structured representations compatible with a hierarchical BI-RADS scheme. The model achieved an Exact Match of 0.997 and micro-F1 values greater than 0.98 in clinically critical fields. The extracted entities were transformed into tabular variables for subsequent modeling. The binary risk classifier achieved a ROC-AUC of 0.953. In the high-risk subset, the cancer prediction model achieved a ROC-AUC of 0.802. The proposed modular architecture preserves interpretability and clinical traceability, demonstrating the feasibility of integrating LLMs into real-world risk-stratification workflows. |
| 09:50 | PRESENTER: Akeem Temitope Otapo ABSTRACT. Alzheimer's Disease (AD) represents a growing global health crisis characterized by progressive cognitive decline, memory impairment, and irreversible brain atrophy. The heterogeneous and irregular nature of longitudinal clinical data presents significant challenges for accurate disease progression modeling, with existing methods struggling with irregular sampling, effective multimodal integration, and simultaneous prediction of multiple clinical outcomes. To address these limitations, we propose an enhanced Neural Ordinary Differential Equations (Neural ODEs) framework that leverages continuous-time dynamics with fourth-order Runge-Kutta (RK4) integration to model AD progression from irregular longitudinal observations. Our approach incorporates a dual-attention multimodal fusion mechanism for task-specific feature weighting and an adaptive multi-task learning strategy with dynamic task balancing. Evaluated on the OASIS-2 dataset, a comprehensive ablation study across six modality-task configurations shows that our multimodal multi-task framework achieves the best overall performance, with the complete model obtaining a mean diagnosis AUC of 0.842, accuracy of 0.744, MMSE R‑squared of 0.575, CDR R‑squared of 0.138, and atrophy R‑squared of 0.510. Our framework outperforms single-modality baselines with robust convergence and interpretable features, advancing integrated AD progression modeling. |
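The fourth-order Runge-Kutta integration over irregular observation times mentioned in the abstract above can be sketched as follows. This is a scalar toy, assuming a hand-written dynamics function `f`; in a Neural ODE the dynamics would be a learned network, and the names `rk4_step` and `integrate` are hypothetical.

```python
import math


def rk4_step(f, t, y, h):
    """One classical RK4 step of size h for dy/dt = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def integrate(f, y0, times):
    """Propagate the state across irregularly spaced observation times."""
    states = [y0]
    for t0, t1 in zip(times, times[1:]):
        states.append(rk4_step(f, t0, states[-1], t1 - t0))
    return states


# Toy dynamics (assumed for illustration): exponential decay dy/dt = -y.
visits = [0.0, 0.1, 0.35, 1.0]  # irregular visit times
trajectory = integrate(lambda t, y: -y, 1.0, visits)
print(trajectory[-1], math.exp(-1.0))
```

Because the step size is taken from the gap between consecutive visits, no resampling onto a regular grid is needed, which is the property that makes continuous-time models attractive for irregular longitudinal data.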
| 10:00 | ABSTRACT. Lower limb exoskeletons show great potential for rehabilitation of patients with different motor impairments. However, robot neurorehabilitation is still far from being widely used in clinical practice given human-robot interaction limitations. The rise of Reinforcement Learning offers an opportunity to tackle the complex problem of human-robot interaction, often avoided by classical position control with a trajectory reference. However, transferring the agent from the native programming environment where it has been trained to the real world for inference presents several limitations. This paper proposes a method for externalizing the training environment through ROS, so that the limitations and challenges of an uncoupled environment are faced from the outset. To prove this concept, a hyper-realistic simulator of the Exo-H3 exoskeleton is controlled by the Reinforcement Learning agent. All experiments were conducted using the Exo-H3 Gazebo simulator connected via ROS. Real hardware validation remains future work. |
| 10:10 | Incomplete information retrieval without imputation: exploiting the correlation among deep features ABSTRACT. Similarity-based retrieval over complex data often relies on deep embeddings derived from images, signals, or textual reports. However, in real-world datasets many records contain missing attributes, which makes similarity comparisons difficult. Traditional solutions either discard incomplete records or impute values, both of which may distort the latent representation space and introduce artificial information. In this work, we propose CURIE for retrieving similar records without imputing or deleting data. CURIE models each attribute as a distance space induced by deep embeddings, and estimates correlations between these spaces. During similarity computation, the contribution of missing attributes is redistributed to correlated ones using a weighting mechanism. Experiments across three image-based datasets using multiple deep feature extractors show that CURIE consistently achieves higher retrieval quality than competitors as the amount of missing data increases. The results indicate that exploiting correlations among latent distance spaces is an effective strategy for similarity retrieval over incomplete data. |
| 10:20 | Applying Machine Learning to Predicting Malaria Prevalence: Spatial Analysis Results and Significance PRESENTER: Abimbola Afolayan ABSTRACT. Malaria remains a significant public health challenge in Nigeria, where climatic conditions favor transmission. Despite national declines, regional disparities persist, reflecting spatial and environmental heterogeneity. This study leverages a probabilistic learning framework, hierarchical Bayesian spatial modelling, to predict malaria prevalence among children aged 2–10 years and generate climate-sensitive risk maps for six southwestern states. Using malaria prevalence survey data from the Nigeria Malaria Indicator Surveys (NMIS) and climatic covariates from the Demographic Health Survey (DHS) spatial repository, we implemented the model within the Integrated Nested Laplace Approximation (INLA) framework, incorporating structured spatial effects via an intrinsic conditional autoregressive prior (ICAR) and unstructured random effects to capture non-spatial variability. Results reveal significant spatial heterogeneity, with Osun recording the highest prevalence (47%), followed by Oyo (45%), Ekiti (44%), Ondo (38%), Ogun (31%), and Lagos (12%). Climatic factors had a marginal influence, with aridity inversely related to prevalence, temperature positively associated, and rainfall exhibiting a non-linear effect. The results indicate that while climate plays a role, local environmental and socioeconomic determinants may also influence malaria prevalence. By integrating spatial dependencies and uncertainty quantification, this approach demonstrates how Bayesian learning can support predictive analytics and data-driven malaria intervention strategies, bridging statistical modelling and machine learning for public health policy. |
| 09:00 | A Secure Processing Platform supporting the European Health Data Space infrastructure ABSTRACT. The European Health Data Space (EHDS) Regulation (EU) 2025/327 requires that secondary use of electronic health data take place under formal data permits, in Secure Processing Environments (SPEs), and with appropriate pseudonymization safeguards. However, no open, permit-driven architecture currently converts governance artifacts into machine-enforceable processing controls. We present a 12-module governance-to-execution pipeline that takes a YAML-encoded data permit and a structured variable inventory and deterministically compiles them into runtime controls that govern every transformation performed on the data. The system generates pseudonymized (HMAC-SHA256), minimized datasets that are ready for researcher analysis, as well as a sealed post-analysis, audit-ready evidence pack. Evaluated on 425,087 MIMIC-IV-ED emergency-department records, the pipeline completes the full 12-module execution in about two minutes and generates over 40 auditable artifacts per run, while providing comprehensive coverage against a TEHDAS2-aligned governance framework. We also identified five areas for future extension: re-pseudonymization, cross-project linkage prevention, key lifecycle management, SPE-native access enforcement, and re-identification governance. The pipeline's novelty stems from its architectural design: a declarative, permit-first system in which every processing step can be traced back to a governance decision and verified using cryptographically sealed evidence outputs. |
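The HMAC-SHA256 pseudonymization mentioned in the abstract above can be sketched with Python's standard library. The function name `pseudonymize`, the example identifier, and the per-project keys are hypothetical; how the pipeline itself derives and manages keys is not stated in the abstract.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, key: bytes) -> str:
    """Return a keyed, deterministic pseudonym for an identifier.

    The same (identifier, key) pair always yields the same pseudonym, so
    records stay linkable within a project, while a different key yields
    unrelated pseudonyms.
    """
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical keys for two separate projects (illustration only).
key_a = b"project-a-secret"
key_b = b"project-b-secret"

print(pseudonymize("patient-0001", key_a))
print(pseudonymize("patient-0001", key_b))
```

Using a separate secret per permit or project is one plausible way to approach the cross-project linkage prevention that the abstract lists as future work; that pairing is an assumption of this sketch.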
| 09:15 | Health Data Space Nodes for Privacy-Preserving Federated Learning and Analysis PRESENTER: Guenter Schreier ABSTRACT. With an ever-increasing volume of health data produced, large-scale medical studies frequently rely on machine learning and advanced analytics. Yet, in many healthcare systems, clinical data is typically distributed across institutions and constrained by privacy and governance requirements. We present a deployable prototype that supports both federated learning and federated analysis without transferring patient-level records. The system connects five previously developed Health Data Space nodes that act as private, harmonised data providers, communicating analysis data exclusively through authenticated REST endpoints. Using open-source data, we trained a logistic regression model as a proof-of-concept containerised architecture and implemented a monitoring tool based on Prometheus and Grafana for traceability. The resulting federated model reached 0.727 accuracy and 0.456 Matthews Correlation Coefficient, compared with a centralised baseline of 0.711 accuracy and 0.422 MCC. Federated training yielded higher specificity (0.788 vs. 0.745) while slightly reducing sensitivity, illustrating a clinically relevant trade-off. In addition, federated analysis was used to compute demographic indicators (mean age and gender ratio) across nodes by aggregating local summaries rather than exposing individual data. Overall, the results indicate that the proposed node solution provides a practical pathway towards privacy-preserving secondary use of health data. |
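The federated analysis step described in the abstract above, computing a mean age from local summaries rather than individual records, can be sketched as follows. The function names and the summary layout (`n`, `sum`) are assumptions for illustration; the prototype's actual REST payloads are not described in the abstract.

```python
def local_summary(ages):
    """Computed inside a node: only the count and the sum leave the node."""
    return {"n": len(ages), "sum": sum(ages)}


def federated_mean(summaries):
    """Computed by the coordinator from the node summaries alone."""
    total_n = sum(s["n"] for s in summaries)
    if total_n == 0:
        raise ValueError("no records reported by any node")
    return sum(s["sum"] for s in summaries) / total_n


# Toy data for three nodes (illustration only).
node_ages = [[30, 40], [50], [60, 70, 80]]
summaries = [local_summary(ages) for ages in node_ages]
print(federated_mean(summaries))
```

Weighting by the per-node count gives the same result as pooling all records, which a plain average of per-node means would not when node sizes differ.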
| 09:30 | From Imbalanced Cohorts to Virtual Populations: Leakage-Aware Synthetic Data Augmentation For Heart Failure Diagnosis ABSTRACT. Early heart failure (HF) diagnosis using real-world clinical data is hindered by class imbalance, heterogeneous documentation, and differences between primary and secondary care pathways. This study presents a leakage-aware, AI-enabled workflow for synthetic data augmentation to support HF classification across care settings. The framework integrates an IEEE 2801-2022–aligned data quality assessment, synthetic minority-class balancing, virtual population generation, and downstream utility evaluation under cross-validation with strictly real-only validation. Multiple synthetic generation approaches, including oversampling, probabilistic mixture models, and deep generative methods, are systematically compared using distributional fidelity metrics and offline augmentation with XGBoost as a strong tabular baseline. Results demonstrate that moderate synthetic augmentation (40–60%) improves balanced accuracy and sensitivity in both care settings, whereas excessive augmentation degrades generalization due to synthetic-to-real mismatch. By explicitly separating care pathways and enforcing leakage-free evaluation, this work provides practical guidance for the responsible use of synthetic data in real-world HF decision support. |
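The leakage-free evaluation idea in the abstract above (augment only the training portion of each fold, validate on real records only) can be sketched as below. Everything concrete here is an assumption for illustration: the generator is plain random oversampling by duplication, `ratio` is interpreted as synthetic rows relative to the real training size, and the name `leakage_free_folds` is hypothetical.

```python
import random


def leakage_free_folds(records, k=5, ratio=0.5, seed=0):
    """Yield (train, val) pairs in which synthetic rows appear in train only.

    Synthetic rows are copies of minority-class rows drawn from the training
    split of the same fold, so no validation record influences them.
    """
    rng = random.Random(seed)
    idx = list(range(len(records)))
    rng.shuffle(idx)
    for fold in range(k):
        val_idx = set(idx[fold::k])
        train = [records[i] for i in idx if i not in val_idx]
        val = [records[i] for i in sorted(val_idx)]  # real records only
        minority = [r for r in train if r["label"] == 1]
        n_synth = int(ratio * len(train)) if minority else 0
        synth = [dict(rng.choice(minority), synthetic=True) for _ in range(n_synth)]
        yield train + synth, val


# Toy cohort with a 25% positive class (illustration only).
cohort = [{"id": i, "label": int(i % 4 == 0)} for i in range(20)]
for train, val in leakage_free_folds(cohort):
    print(len(train), len(val))
```

Generating synthetic rows before splitting would let information from validation records leak into training, which is the failure mode the strictly real-only validation in the abstract is meant to rule out.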
| 09:45 | Synthetic Data Generation and Multi-Dimensional Evaluation in Fondazione Italiana Linfomi (FIL) Diffuse Large B-Cell Lymphoma Clinical Cohort ABSTRACT. Synthetic patient data offer a promising solution to the privacy and accessibility constraints that hinder data-driven innovation in healthcare, particularly for clinical trial research. This paper presents a systematic study of synthetic data generation and evaluation using a real-world clinical cohort from the Fondazione Italiana Linfomi. We define a standardized, model-agnostic workflow to benchmark diverse generative paradigms across the critical dimensions of fidelity, privacy and utility. Our findings reveal that no single model excels across all metrics, emphasizing the need to evaluate models on a case-by-case basis. By proposing a structured, literature-informed evaluation suite, this work facilitates context-aware model selection to support reproducible clinical research in the context of clinical trials. |
| 10:00 | Mapping Actions and Resources for Innovation in Rare Diseases (MARI-RD): Tracking Collaboration Networks in Portuguese-Speaking Countries PRESENTER: Filipe Bernardi ABSTRACT. Rare disease ecosystems are fragmented and under-mapped across Portuguese-speaking countries. We present a collaboration network tracking framework that integrates REDCap survey data, deduplication, standardized categories, and temporal network metrics to monitor cross-border partnerships and identify hub institutions. This study is part of MARI-RD (Mapping Actions and Resources for Innovation in Rare Diseases), a WP1 activity of the CPLP project focused on mapping actions and resources for innovation. The approach supports longitudinal tracking of referrals, institutional ties, and service coverage, producing reproducible indicators for policy and capacity building. |
| 10:15 | VIEWER: A population health management platform for supporting electronic audit and feedback in mental health care ABSTRACT. Due to the burden and negative impacts of psychotic disorders, treatment alone is insufficient to reduce the gap in psychosis care thereby highlighting the need for greater investment in prevention strategies which are targeted towards an entire population to detect risk factors early and prevent adverse health outcomes at scale. This work evaluates the technical feasibility of implementing electronic audit and feedback (eA&F) to support population health management in mental health care by facilitating the visualisation of clinical health records to provide relevant summaries for eA&F. A Participatory Design approach was adopted as an integral framework throughout the design and development of an eA&F platform. The eA&F platform, VIEWER, was evaluated for its technical feasibility. The feasibility testing of the platform was carried out with clinical teams in the context of three different clinical use cases to support eA&F for managing the psychosis patient population. An eA&F platform for supporting population health management was developed. The platform was used successfully for eA&F to support the identification of unmet needs, variations in care processes, and inequalities between groups of patients within the psychosis patient population. It is technically feasible to develop and deploy an eA&F platform that supports population health management in secondary mental healthcare. Using advanced clinical analytics to integrate population-level data with routine clinical practice raises exciting possibilities of building more intelligent services that are adaptive to the needs of the population, and adopt a more proactive, preventative approach. 
The innovative eA&F platform gives clinicians the flexibility to look at clinical data that is relevant to them at the level of abstraction that is appropriate for the decision being made. By sharing the methods and steps taken in this work, useful insights can be provided to other researchers and practitioners that are developing similar digital health interventions. |
Parquet Lobby Area (in the Main Lobby). Please enjoy your coffee and visit the Showcases Area in the Panorama Room via the terrace.
Andreas Panayides (CYENS Centre of Excellence, Cyprus)
SupportAI: A Multimodal AI Platform for Clinical Decision Support - Project Overview ABSTRACT. This paper presents the SupportAI project, a comprehensive research initiative funded by the Italian National Recovery and Resilience Plan (PNRR) through the Tech4You program. The project develops a modular AI platform that integrates heterogeneous biomedical data through live HL7 FHIR R4 and DICOM infrastructure, enabling real-time clinical decision support. The platform combines advanced 3D medical imaging with Fourier frequency-domain analysis, AI-powered segmentation, generative AI assistance, 3D bioprinting capabilities, and telecollaboration infrastructure. Technical validation at Technology Readiness Level 6 demonstrates robust performance with 0.00% error rates across all modules under concurrent load, sub-15 ms response times for imaging workflows, and successful integration of RAG-based generation with live clinical data retrieval. Initial results confirm the technical feasibility of unified multimodal biomedical data integration within a standards-compliant clinical AI platform. |
AVARIS-SmartPLAIGO: Integrating AI-Driven Decision Support into Cyprus’s National Ambulance Dispatch System ABSTRACT. Ambulance dispatch is a time-critical process in which the quality of resource allocation decisions can directly influence patient outcomes. In Cyprus, all pre-hospital emergency operations are coordinated through AVARIS, the national ambulance dispatch and fleet management system developed by the eHealth Laboratory at the Cyprus University of Technology and operated by the national ambulance services. Although AVARIS effectively supports daily dispatch workflows, it does not currently incorporate predictive analytics or optimization capabilities. The SmartPLAIGO project (CODEVELOP-AG-SHHE/0823/0112), co-funded by the Cyprus Research and Innovation Foundation, addresses this gap by introducing an intelligent decision support layer comprising a GIS module for travel-time estimation, an optimization engine for ambulance-to-incident assignment, and a dedicated communication platform with shift scheduling capabilities. The framework is designed to operate alongside AVARIS while preserving the dispatcher’s full authority over every dispatch decision. This paper describes the AVARIS system, the SmartPLAIGO architecture and its integration with AVARIS, and reports on five pilot evaluations conducted with active EMS personnel under both simulated and field conditions. |
Deploying Telemonitoring Ecosystems for Cancer Care Across Europe: The eCAN and eCAN Plus Joint Action Project Experience ABSTRACT. Cancer remains one of the leading causes of mortality in Europe, placing significant pressure on healthcare systems and highlighting the need for innovative approaches to long-term patient monitoring and follow-up care. Digital health technologies, particularly telemedicine and telemonitoring solutions, provide new opportunities to improve patient management, enhance communication between patients and clinicians, and support data-driven clinical decision making. Within this context, the European Joint Action projects eCAN (Strengthening eHealth including telemedicine and remote monitoring for cancer prevention and care) and eCAN Plus aim to enhance the digital capabilities of cancer centres across Europe by developing and deploying interoperable telemedicine infrastructures and remote patient monitoring services. In these initiatives, the Cyprus University of Technology (CUT) contributes to the design and development of a comprehensive telemonitoring ecosystem that integrates mobile health applications, clinician dashboards, and middleware services connecting wearable devices with healthcare platforms. The system enables cancer patients to report symptoms and patient-reported outcomes while simultaneously collecting physiological and behavioral data from wearable sensors such as smartwatches. These data streams are securely transmitted through a middleware layer to a centralized infrastructure that supports data aggregation, interoperability, and secure communication with clinical systems. Clinicians access the collected information through web-based dashboards that provide real-time visualization of patient metrics, symptom trends, and activity indicators. 
This infrastructure supports remote follow-up of cancer patients, early identification of potential complications, and improved patient engagement during rehabilitation and post-treatment monitoring. The platform has been deployed within pilot activities across multiple European countries participating in the Joint Actions, demonstrating the feasibility of integrating telemonitoring services into real clinical workflows and heterogeneous healthcare environments. Beyond enabling remote monitoring, the architecture has been designed to support scalability and interoperability with European digital health initiatives, including cross-border health data exchange and future AI-driven analytics. The integration of wearable technologies, mobile health applications, and clinician decision support dashboards provides a foundation for advanced data analytics and personalized cancer care pathways. The experience gained through the implementation and deployment of this ecosystem provides valuable insights into the technical, organizational, and interoperability challenges involved in scaling telemonitoring infrastructures across national healthcare systems. This paper presents the architecture, implementation approach, and deployment experience of the telemonitoring ecosystem developed within the eCAN and eCAN Plus Joint Action projects, highlighting lessons learned and opportunities for future expansion of digital health solutions supporting cancer care across Europe. |
REMO: An Integrated Remote Patient Monitoring Initiative to Reduce Clinical Workload Through Workflow-Oriented Care Pathways ABSTRACT. REMO is an international research and innovation initiative that develops and validates an integrated remote patient monitoring approach aimed at reducing clinical workload while improving continuity of care outside traditional clinical settings. The project is operationalized through two complementary clinical use cases spanning transitional and preventive or supportive care. Use Case 1, deployed in Lithuania, targets spinal rehabilitation through a hybrid model that combines onsite assessment with monitored home exercise execution and clinician review through dashboard-supported feedback loops. Use Case 2, deployed in Portugal, targets integrated monitoring of sleep and mental well-being for adults and older adults, combining low-burden longitudinal sensing with workflow-oriented triage, recommendations, and escalation when meaningful changes are detected. REMO is structured around three cross-cutting building blocks that support both use cases: multimodal data systems and sensors; trustworthy AI for interpretable inference in real-world conditions; and clinical workflow integration to minimize alert fatigue and enable actionable follow-up. The paper presents the project rationale, consortium, and integrated framework, details both use cases and their validation strategy, and reports early implementation milestones. |
VELES Excellence Hub - Strengthening the South-East Europe Smart Health Regional Excellence and Boosting the Innovation Potential ABSTRACT. Smart Health is an EU Strategic Value Chain that contributes to growth, jobs and competitiveness. Health data is the main enabler of the value chain, but it also depends on cutting-edge technologies (AI, cloud computing, IoT) to integrate the dispersed knowledge and support innovative healthcare solutions and services. The EC sets the creation of a European Health Data Space as a main priority, to promote better exchange of and access to health data. On a regional level, the RIS3 strategies of the involved widening countries (Bulgaria, Romania, Greece and Cyprus) emphasize the need for digitalized healthcare based on Big Data, AI and IoT, to enable personalized medicine, informed decisions, and improved disease prediction. VELES raises the level of innovation excellence in the South-East EU by creating a sustainable place-based innovation ecosystem, enabled by a Regional Smart Health Data Space. The Regional Smart Health Data Space will be demonstrated through the design of 4 interrelated pilots on Cancer treatment (Greece); Personalised/precision medicine of Alzheimer (Bulgaria); Cerebral tumours (Romania) and Dementia (Cyprus). The aim of VELES is to foster regional and national strategies for health data sharing, to secure improved clinical practice, to preserve patients’ privacy and to empower citizens’ smart healthcare through access to innovative, cyber-secure and data-driven digital health services. |
The LIFEMap Project: Genomic Mapping of Paediatric Pathologies for Personalised Prevention of Cardiovascular and Neoplastic Diseases in Adults PRESENTER: Giordano D'Aloisio ABSTRACT. The LIFEMap Project presents a framework for personalised medicine by linking paediatric pathologies to adult cardiovascular, rheumatological, and neoplastic diseases through genomic and environmental profiling. Recognising that early-life factors affect long-term health, the project aims to identify childhood conditions and their genetic and environmental correlates to predict adult disease susceptibility. LIFEMap gathers clinical, phenotypic, and genomic data from over 5,000 subjects within a secure, federated platform for large-scale analysis. Its objectives include developing infrastructure for whole-genome sequencing, advanced AI pipelines, and a biobank for prospective studies. Current progress includes defining the platform’s architecture and harmonising retrospective datasets. Ultimately, LIFEMap seeks to enable early diagnosis and targeted interventions, advancing precision medicine using insights from early life. |
Automated model-based surgical planning tool for PROstaTE Cancer brachyTherapy – PROTECT ABSTRACT. The PROTECT project takes up the challenge of boosting the accuracy and efficiency of High Dose-Rate BrachyTherapy (HDRBT) through an automatic, data-driven, cloud-based, digital platform for the realistic training of preoperative and intraoperative planning of HDRBT. We present a cutting-edge generative Artificial Intelligence (AI) framework for automatic delineation of the prostate gland and urethra during HDRBT, which will provide a substantial speed-up over manual interventions, reducing the chance of errors, optimizing the application of therapy and minimizing harmful side effects. In addition, we demonstrate a novel in silico modeling tool for personalized prediction of needle insertion that accounts for the induced deformation on the prostate during the HDRBT procedure. |
AI-RBD: An Integrated Framework for Privacy-Preserving AI Analysis of REM Behavior Disorder in Video-Polysomnography ABSTRACT. This document describes the AI-RBD initiative, a research project focused on developing artificial intelligence and computer vision solutions for the automated detection of REM Sleep Behaviour Disorder (RBD). By leveraging video-polysomnography (vPSG), the project aims to enhance diagnostic precision and clinical efficiency. |
DeepECG-Kit: A Practical and Reproducible Framework for ECG Deep Learning ABSTRACT. We present DeepECG-Kit, an open-source Python library for reproducible ECG deep learning research. The library unifies multiple datasets under a standardized preprocessing pipeline and provides reference implementations of seventeen neural network architectures. These span convolutional, recurrent, attention-based, state-space, and hybrid approaches. Built on PyTorch, DeepECG-Kit supports automated checkpointing, stratified data splitting, and comprehensive evaluation through both a command-line interface and a Python API. We demonstrate the library by benchmarking six architectures on four datasets covering multi-label diagnostic classification, multi-class rhythm detection, binary atrial fibrillation detection, and cross-dataset AF classification under identical conditions. DeepECG-Kit is freely available at https://github.com/stevenah/deepecg-kit and can be installed via pip install deepecg-kit. |
From Prototype to Data Quality: Usability Findings for a Patient-Facing AI-Empowered Mobile Application for Automatic Cancer Pain Assessment ABSTRACT. Artificial intelligence (AI)-based Automatic Pain Assessment (APA) systems using multimodal inputs (e.g., facial expression, speech, and self-report) can enable continuous, multidimensional monitoring of cancer-related pain in outpatient care. However, effective implementation and clinical integration are hindered by a lack of human-centered design and real-world usability data. Poor interface design can lead to low-quality data and patient abandonment, particularly in vulnerable populations. This study presents the first stage usability evaluation of the SENSAI Pain mobile application, an AI empowered tool for structured pain self-report and audiovisual data collection. We conducted a mixed-method usability evaluation with ten healthy volunteers utilizing task-based testing, standardized questionnaires, and interviews. Although perceived usability was rated as excellent, task-based metrics and qualitative findings revealed critical interaction barriers. These issues directly affected the completeness and quality of the data collected for future APA processing. These findings show that early usability evaluation can reveal critical data-collection risks that may not be captured by subjective usability scores alone. Addressing identified workflow barriers through iterative, human-centered redesign is a necessary step prior to evaluation with cancer patients and before integrating APA model processing for longitudinal outpatient deployment. |
Opportunistic LoRaWAN-Based Step-Count Monitoring for Postoperative Follow-Up in Low-Connectivity Rural Settings PRESENTER: Diego Narciandi-Rodríguez ABSTRACT. Postoperative remote monitoring solutions based on smartphone applications face accessibility barriers in rural and elderly populations who may lack compatible devices, stable connectivity, or digital literacy. This paper presents a LoRaWAN-based mobility monitoring prototype designed for settings where patients have no Internet access, no SIM card, and no smartphone, enabling objective step-count telemonitoring without patient-side configuration or burden. The system follows an opportunistic offloading model: an end node equipped with a nine-axis inertial measurement unit for hardware-level step counting accumulates mobility data locally and uploads it automatically when the patient enters the coverage area of a single fixed gateway, with no user action required. Decoded step-count records are delivered to a clinical dashboard, enabling healthcare providers to track postoperative mobilization progress in support of Enhanced Recovery After Surgery (ERAS) protocols. A field evaluation on an urban campus over 61 days yielded 137 received frames at distances up to 1,254 m from a gateway deployed indoors, with a mean signal-to-noise ratio (SNR) of +5.7 dB and successful decoding at values as low as -11.75 dB, confirming LoRa sub-noise-floor demodulation. Indoor measurements showed a 2.4 dB mean SNR reduction relative to outdoor positions, with all indoor frames successfully decoded. These results support the feasibility of a single-gateway deployment inside a rural health center or pharmacy for connectivity-free postoperative step-count telemonitoring, reducing patient and staff burden compared with in-person follow-up or return-and-report approaches. |
| 11:00 | Transfer Learning for ECG Classification: Effects of Shared Labels, Granularity, and Fine-Tuning in Small Clinical Datasets PRESENTER: Chito Patiño ABSTRACT. Accurate electrocardiogram (ECG) classification using deep learning requires large annotated datasets. Typically, hospitals lack sufficient amounts of labeled data for model training. Transfer learning, which adapts models pretrained on large, open datasets (source domain) to local settings (target domain), offers a promising solution for addressing this issue. However, its effectiveness can vary greatly depending on how well the data distributions and the scope and specificity of diagnostic labels match between the source and target domains. In this study, we systematically evaluate transfer learning strategies for ECG classification in settings with limited annotated target data (2000 records or fewer), focusing on how the number of shared diagnostic labels and the level of detail in the label categories differ between source and target data. Using a ResNet architecture, we conducted three experiments: (1) assessing target model performance as a function of the number of shared labels between the source and target label sets, (2) analyzing the effect of source label specificity (coarser vs. fine-grained categories) on the model’s performance in the target domain, and (3) evaluating the effect of fine-tuning depth. The results showed transfer learning to be ineffective with only one shared diagnostic label, whereas model performance steadily improved as the number of shared labels increased. Pretraining with medium-level label granularity delivered the best results when few labels were shared and data were scarce, while fine-grained pretraining excelled when more labels were shared. Deeper fine-tuning improved model performance and enabled the use of coarser source labels. 
Altogether, these findings provide guidance on when transfer learning is likely to offer better performance than models trained solely on local hospital data. |
| 11:15 | Integrating the Clarke Error Grid into Deep Learning to Enhance Clinical Performance in Glucose ABSTRACT. Accurate blood glucose level (BGL) prediction based on Continuous Glucose Monitoring (CGM) data is a cornerstone of modern diabetes mellitus management. BGL prediction models based on Artificial Neural Networks or deep learning are typically optimized using global error metrics, such as Mean Squared Error (MSE). However, these metrics often fail to account for the clinical significance of prediction errors, which varies drastically across different BGL ranges. This work introduces a glucose-range-specific cost function designed to prioritize clinical safety by emphasizing performance in critical regions, specifically hypoglycemia and hyperglycemia. The proposed cost function is derived from the Clarke Error Grid (CEG), a clinical standard performance metric used in BGL prediction assessment that weights errors based on their potential risk to the patient. This novel cost function is evaluated using the T1DiabetesGranada dataset and a Long Short-Term Memory (LSTM) architecture to predict BGL values at a 60-minute prediction horizon. Experimental results demonstrate that the glucose-range-specific cost function significantly enhances performance in critical BGL ranges. Notably, the model achieved up to a 75% improvement in prediction performance within the hypoglycemic range compared to standard MSE-based BGL prediction models, without compromising global prediction performance. These findings suggest that integrating clinical risk into the learning process produces models that are both robust and more safely aligned with the requirements of real-world T1D care. |
| 11:30 | ABSTRACT. Antimicrobial resistance represents a critical global health threat, exacerbated by inappropriate empirical antibiotic prescribing. To address this, we developed a Deep Learning-based Clinical Decision Support System trained on over 200,000 historical antibiogram records from three French hospitals to predict antimicrobial resistance and assist clinicians in therapy selection at the bedside, while clinical specimen culture and antimicrobial susceptibility testing are processed at the laboratory. Our model achieves high performance (AUROC of 0.92) and reliable uncertainty calibration (AUSE of 0.03) through Bayesian inference. However, clinical deployment faces the persistent challenge of concept drift, as resistance patterns evolve over time. To handle this drift, our present contribution applies the ADWIN algorithm to monitor both error rates and uncertainty signals, enabling detection of concept and virtual drift. This dual-level approach allows proactive identification of emerging resistant strains and ensures the long-term safety and reliability of the system in dynamic clinical settings. |
| 11:45 | A Hierarchical Deep Learning Framework for Rapid Eye Movement Behavior Disorder Detection PRESENTER: António Cardoso ABSTRACT. Rapid Eye Movement (REM) Behavior Disorder (RBD) is a parasomnia strongly associated with future neurodegenerative diseases, yet its diagnosis depends on labor-intensive polysomnography (PSG) analysis and analytical indices such as the REM Atonia Index (RAI). Automated detection is challenging due to the temporal sparsity and variability of RBD-related events. This work presents SOMNUS-RBD, a hierarchical deep learning framework for patient-level RBD prediction from REM sleep PSG. REM segments are encoded into embeddings, then processed with channel-level attention pooling and temporal LogSumExp aggregation to capture sparse pathological activations. Five data configurations were evaluated on a 49-patient cohort using 10-repeated 5-fold cross-validation: (1) chin electromyography (EMG) embeddings alone, (2) multichannel EMG with independent per-channel embeddings, (3) multichannel EMG with joint multi-channel embeddings, (4) multimodal EMG, electroencephalography (EEG) and electrooculography (EOG) with independent per-channel embeddings, and (5) multimodal EMG, EEG, and EOG with unified per-modality embeddings. Using only chin EMG, the proposed framework showed improvements compared to the analytical baseline (RAI), as accuracy increased from 0.735 to 0.786 and recall from 0.577 to 0.781, while maintaining competitive precision and AUC. Independent multichannel EMG achieved the best performance with an AUC of 0.832 and a recall of 0.827. In contrast, grouping channels prior to embedding extraction reduced discriminative performance. Attention weights highlighted EMG as the dominant modality compared to EEG and EOG, and temporal importance scores aligned with physiologically meaningful segments of loss of atonia. 
These findings suggest that SOMNUS-RBD surpasses traditional analytical indices while providing intrinsic channel- and timestep-relevant measures for RBD detection. |
| 12:00 | Group Therapy for Elderly Depression: Deep Learning Based on Large Models of Music Affective Computing ABSTRACT. Music therapy offers a non-invasive alternative; however, its effectiveness depends heavily on individualized emotional matching. To address this limitation, this study proposes an Intelligent Music Therapy System that integrates Music Affective Computing Models with Internet of Things–based wearable sensing for personalized emotional intervention. The system incorporates a deep learning–based Music Affective Computing Model to recommend music and an IoT framework to continuously acquire physiological signals, including heart rate variability, from elderly users. Six representative Music Affective Computing Model architectures were systematically evaluated, among which a hybrid Fractal Convolution Neural Network–Long Short-Term Memory–Transformer model demonstrated the highest classification accuracy and generalization stability in Chinese classical and ethnic music emotion recognition. To validate clinical applicability, an intelligent music therapy system was deployed in a randomized controlled group music therapy trial for elderly individuals. Experimental results indicated significant improvements in heart rate variability indices and depressive mood scores compared with the control condition; therefore, the proposed system can effectively support emotion-aware music intervention in nursing home environments. |
| 12:15 | From Concepts to Evidence: Literature-Grounded Skin Lesion Diagnosis with Vision–Language and Retrieval-Augmented Models PRESENTER: Gabriel Santos Martins Dias ABSTRACT. Artificial intelligence has shown strong potential for dermatological diagnosis, but clinical adoption requires not only accurate predictions but also transparent and verifiable reasoning. Recent concept-based approaches improve interpretability by exposing clinically meaningful dermoscopic features predicted from images; however, the diagnostic explanations produced by large language models (LLMs) remain internally generated and are not explicitly grounded in external biomedical evidence. We propose a literature-grounded diagnostic framework that integrates dermoscopic concept extraction with retrieval-augmented generation (RAG) over a dermatology-focused PubMed corpus. The framework combines vision–language models for lesion concept prediction, dense biomedical literature retrieval, multi-stage reranking strategies, and LLM-based reasoning to generate diagnoses accompanied by explanations supported by retrieved scientific evidence. Beyond introducing this architecture, we conduct a systematic analysis of retrieval design choices—including diversity-based retrieval, cross-encoder reranking, and natural language inference filtering—and examine how these strategies influence both diagnostic performance and explanation grounding. Experiments on Derm7pt, HAM10000, and PH² under the same binary setting used in prior concept-based work (nevus vs. melanoma) achieve balanced accuracy up to 0.792, 0.735, and 0.831, respectively. Our results reveal a trade-off between retrieval diversity and semantic precision: diversity-oriented strategies improve diagnostic performance, while precision-oriented reranking yields explanations more strongly supported by biomedical evidence. 
By explicitly linking predictions to retrievable scientific literature and enabling claim-level grounding analysis, the proposed framework supports auditable and evidence-grounded dermatological AI systems. |
| 11:00 | Bridging 3D Deep Learning and Curation for Analysis and High-Quality Segmentation in Practice ABSTRACT. Accurate 3D microscopy image segmentation is essential for quantitative bioimage analysis, yet state-of-the-art foundation models frequently produce error-prone results that necessitate manual proofreading. Manual curation remains the bottleneck for generating high-quality training data and ensuring biological downstream accuracy. We present VessQC, an open-source tool for uncertainty-guided curation of volumetric segmentations. VessQC integrates uncertainty maps to prioritize user attention on regions with high error probability, optimizing the human-in-the-loop workflow. In a study of 3D light-sheet microscopy volumes of murine brain vasculature, uncertainty-guided correction improved error detection recall significantly compared to conventional curation, without increasing total processing time. VessQC thus enables efficient, human-in-the-loop refinement of volumetric segmentations and bridges a key gap in real-world applications between uncertainty estimation and practical human-computer interaction. The software is freely available at github.com/MMV-Lab/VessQC. |
| 11:10 | BEA-Net: Boundary-Aware 3D Attention Network for MRI Knee Cartilage Segmentation PRESENTER: James Battye ABSTRACT. Segmenting knee cartilage from magnetic resonance images is vital to understanding the pathogenesis and progression of knee osteoarthritis. However, it presents several challenges due to the complex morphology and thin structure of knee cartilage. Dedicated boundary learning has been shown to improve segmentation predictions from deep learning models when segmenting tissues with complex or unclear boundaries, but the application of dedicated boundary learning for 3D segmentation of knee cartilage from MRI has been limited. In this work, we introduce BEA-Net, a multi-task boundary-aware attention network for 3D segmentation of knee cartilage from MRI. BEA-Net uses a dual-branch architecture with an auxiliary decoder dedicated to learning boundary information. A novel boundary-enhancement attention module is used to amplify and focus on salient boundary features to refine boundary predictions. Learnt boundary features are fused with primary-decoder features to enhance segmentation predictions, and the network is optimised using a combination loss that encourages inter-decoder consistency. BEA-Net was evaluated on an MRI dataset from the Osteoarthritis Initiative, outperforming other state-of-the-art models when segmenting four types of knee cartilage and achieving Dice scores of 89.52%, 88.46%, 85.87%, 87.11% when segmenting the femoral, tibial, and patellar cartilage, and the meniscus respectively. |
| 11:20 | Cross-Dataset Generalization in Breast MRI Tumor Classification via Class-Wise Dataset Mixing ABSTRACT. Breast MRI is highly sensitive for detecting breast tumors, but exam volumes are large and time-consuming to interpret. Although deep learning has shown strong performance on internal test splits, models often fail to generalize across institutions because of domain shift and dataset-origin bias. We study this failure mode and propose a simple training strategy to improve cross-dataset generalization for breast tumor classification. We train EfficientNet-B3 and WaveViT-Small using Duke Breast Cancer MRI and fastMRI, and we evaluate exclusively on the independent multi-center MAMA-MIA cohort. A controlled confounded setup, where the label is perfectly correlated with dataset origin, leads to near-chance external accuracy (0.5048–0.5265) despite high internal performance, indicating shortcut learning. We then break this correlation by mixing both datasets within each class, while using patient-level splitting, augmentation, and strict leakage controls. On MAMA-MIA, dataset mixing improves accuracy/F1 to 0.8463/0.8625 (WaveViT-Small) and 0.8884/0.8994 (EfficientNet-B3), with EfficientNet-B3 reaching 0.9973 recall. These results show that controlling dataset-origin bias and enforcing strict external validation are key to developing more reliable breast MRI classifiers. |
| 11:30 | Multimodal AI for clinically significant prostate cancer detection: A comparative study of structured clinical-based, imaging and fusion models ABSTRACT. Accurate detection of clinically significant prostate cancer (csPCa) remains a major clinical challenge due to the limited cancer specificity of prostate-specific antigen screening, the invasiveness of biopsy procedures, and the inter-reader variability in magnetic resonance imaging interpretation. In this setting, artificial intelligence provides promising data-driven strategies to improve diagnostic precision and support clinical decision-making. This study presents a comparative analysis of modeling strategies for csPCa detection, examining approaches based on structured clinical variables, magnetic resonance imaging data alone, and their multimodal integration. The apparent diffusion coefficient, derived from diffusion-weighted imaging, is included to quantify its incremental predictive value. Results show that clinically based models achieve strong and stable performance, whereas image-only approaches are mainly limited by data scarcity. Multimodal integration provides competitive results and improves sensitivity in clinically relevant operating regions, although its gains remain influenced by the representational capacity of the imaging branch. Overall, the findings provide practical guidance on selecting appropriate modeling strategies for csPCa prediction under different data availability constraints, clarifying the trade-offs between clinical, imaging-based, and multimodal approaches. |
| 11:40 | A Semi-Supervised Multiclass Pixel-Domain Classification Approach for Breast Cancer Microscopy Images Based on Nonlinear Metrics ABSTRACT. Breast cancer is one of the most prevalent and deadly diseases worldwide, representing a major public health challenge that affects millions of individuals each year. Histopathological analysis, more specifically the immunohistochemistry (IHC) technique, plays a fundamental role in diagnosis by enabling the identification and quantification of positive and negative cellular markers. However, the analysis of microscopy images is time-consuming, subjective, and prone to human error, and may present variability. With recent computational advancements, many automated methods for image analysis and decision support have been developed for breast cancer diagnosis based on IHC microscopy images. However, when the objective metric used to assess the similarity or dissimilarity of biomarker colors is linear, it is not expressive enough to characterize the complexity of the multi-biomarker scenarios inherent to tissue characterization by IHC. This limitation represents a significant drawback in properly modeling nonlinear patterns and morphological variations across different cellular classes. To address these challenges, this paper proposes a multiclass classification approach based on nonlinear metrics, specifically designed for breast cancer microscopy images in IHC. The proposed method extends beyond linear classifiers by employing a nonlinear model based on polynomial feature expansion of the Mahalanobis distance, enabling the capture of complex, nonlinear relationships among cellular patterns in IHC images. Experimental results obtained on breast cancer microscopy datasets demonstrate that the proposed approach achieves promising performance, with a micro-averaged F1-score of 0.76, overall precision of 0.77, and specificity of 96.3% for positive nuclei detection, indicating robustness to false-positive classifications. |
| 11:50 | A Detection-Driven Pipeline for Nuclei Classification in Nasal Cytology PRESENTER: Ciro Russo ABSTRACT. Nasal cytology is a minimally invasive diagnostic technique used to assess inflammatory and allergic conditions of the upper airways through microscopic examination of cellular specimens. Despite its clinical relevance, computational studies in this domain remain limited, and the impact of nuclei localization quality on downstream classification performance has not been systematically investigated. In clinical practice, nuclei must first be localized before morphological interpretation can occur, introducing a sequential dependency between detection and classification. In this work, we introduce and evaluate a detection-driven pipeline for instance-level nuclei classification in nasal cytology using the Nasal Mucosa Cell Dataset (NMCD), the first publicly available dataset providing structured nucleus-level annotations in this field. The proposed pipeline mirrors the sequential clinical workflow by explicitly separating nuclei localization and classification, enabling a controlled analysis of the interaction between these two stages. To investigate the role of localization quality, we compare classification performance under two complementary configurations: an ideal scenario based on reference nucleus annotations and a detection-driven scenario operating exclusively on automatically localized nuclei. Experimental results show that end-to-end classification accuracy decreases by 7.7% under detection-driven conditions compared to ideal localization, highlighting the impact of localization errors on downstream morphological classification. |
| 12:00 | GTDiagnosis: Intelligent Pathological Diagnosis of Gestational Trophoblastic Diseases via Visual-Language Deep Learning Model ABSTRACT. The pathological diagnosis of gestational trophoblastic disease (GTD) is time-consuming, relies heavily on the experience of pathologists, and shows low consistency at initial diagnosis, which seriously threatens maternal health and reproductive outcomes. We developed an expert model for GTD pathological diagnosis, named GTDoctor. GTDoctor employs our innovative multi-scale adaptive attention mechanism for pixel-level lesion segmentation, and builds a decision model through structured and unstructured feature extraction, combined with a large language model to provide personalized pathological analysis results. We developed a software system, GTDiagnosis, based on this technology and conducted clinical trials. The retrospective results demonstrated that GTDiagnosis achieved a mean precision of over 0.91 for lesion detection in pathological slides. In prospective studies, pathologists using GTDiagnosis attained a Positive Predictive Value of 95.59%. The tool reduced average diagnostic time from 56 to 16 seconds per case. GTDoctor and GTDiagnosis offer a novel solution for GTD pathological diagnosis, enhancing diagnostic performance and efficiency while maintaining clinical interpretability. |
| 12:10 | Performance and Feature Analysis in Cataract Detection Using Artificial Intelligence Models ABSTRACT. Cataract remains a leading cause of visual impairment worldwide, motivating the development of automated screening tools based on retinal fundus photography. This study examines how explicit feature extraction affects artificial intelligence performance for binary cataract classification (normal vs. cataract) and discusses the resulting implications for interpretability and explainability. We compare end-to-end deep convolutional models (ResNet-50 and an InceptionV3+ResNet50V2 configuration) with a traditional machine-learning baseline (support vector machine, SVM) under three representation settings: raw images (no feature extraction), local binary patterns (LBP), and histogram of oriented gradients (HOG). Model performance is evaluated using accuracy, recall, F1-score, and precision–recall area under the curve (PR-AUC). Overall, end-to-end deep learning achieved the strongest discrimination without explicit feature engineering; notably, ResNet-50 operating directly on raw images attained the best F1-score (86.32 ± 0.94%). In contrast, classical SVM benefited substantially from engineered descriptors, improving from poor performance on raw pixels to an F1-score of 80.45 ± 2.18% when combined with HOG. These findings indicate that feature extraction can enhance classical pipelines and provide intrinsically interpretable representations, whereas deep networks largely benefit from learning task-specific features directly from fundus images. The results highlight a practical performance–interpretability trade-off relevant to cataract screening systems, particularly in resource-constrained deployment scenarios. |
| 11:00 | Deep Learning–Based Hospital Admission Prediction from Spanish Psychiatric Electronic Health Records PRESENTER: Arturo Crespo-Álvaro ABSTRACT. The automatic identification of patients requiring psychiatric hospitalization is critical for ensuring clinical safety and optimizing resource allocation in emergency mental health services. This study presents a deep learning–based approach for binary hospitalization prediction from unstructured Spanish psychiatric clinical notes. A dataset of 500 emergency psychiatric evaluations collected from CAULE was curated and anonymized, resulting in 409 validated records after preprocessing and outcome filtering. Several Transformer-based architectures were evaluated, including general-purpose English models (BERT), Spanish-specific pretrained models (BETO), and a clinically oriented English model (ClinicalBERT). Models were fine-tuned under stratified 70–15–15 splits and further optimized through systematic hyperparameter search. Results show that Spanish-pretrained models achieve the best overall performance, with BETO-cased reaching 0.952 in both Accuracy and F1-score on the independent test set. Hyperparameter optimization significantly improves performance across architectures, particularly for language-aligned models. Findings highlight the importance of linguistic compatibility and optimized training configurations when applying Transformer models to psychiatric clinical text. This work contributes to advancing clinical NLP research in Spanish and supports the development of AI-driven decision-support tools for mental health care. |
| 11:15 | Unified Vascular Score: A Data-Driven Framework for Personalized Cardiovascular Risk Stratification Using Arterial Stiffness PRESENTER: Kiran V Raj ABSTRACT. Traditional risk scores (TRS) such as Framingham, WHO, and Globorisk guide preventive decision-making by predicting future cardiovascular events using conventional risk factors. However, their reliance on population-based models limits their ability to capture underlying subclinical vascular damage, thereby restricting granular risk differentiation and effective personalized prevention. To address this limitation, we propose a Unified Vascular Score (UVS) that integrates arterial stiffness-based vascular health markers. In a study of 205 participants, we have validated UVS against TRS, and investigated its ability to further capture vascular heterogeneity across age–hypertension phenotypes within and across TRS risk strata. Inter-score comparison showed moderate agreement among TRS, with Globorisk aligning most closely with the consensus stratification (87%). When compared with TRS frameworks, UVS demonstrated a consistent monotonic increase across low-to-high risk strata. While TRS grouped all four physiological subgroups (Young Normal, Young Hypertensive, Old Normal, Old Hypertensive) within the same low-risk category, the proposed UVS revealed clear separation among them. This preserved discrimination highlights underlying vascular heterogeneity and demonstrates the ability of UVS to capture biologically meaningful differences that remain masked within conventional categorical risk stratification. These findings suggest that UVS complements traditional risk models while providing finer resolution within each risk category, enabling improved detection of subclinical vascular differences among individuals with similar conventional risk profiles. |
| 11:30 | Multimodal ECG Abnormalities Classification Approach Based on Anamnesis Patient Data and Signal Integration ABSTRACT. Automated interpretation of 12-lead electrocardiogram (ECG) images offers critical value as a clinical decision support tool, enabling rapid and accurate patient triage in fast-paced emergency environments. However, existing classification models often rely solely on visual data, ignoring the essential patient history utilized by human cardiologists. This paper proposes a novel multimodal deep learning architecture that integrates raw static ECG images with baseline cardiovascular risk factors (e.g., age, sex, blood pressure) to optimize diagnostic precision. The model employs a Contrastive Language-Image Pre-training (CLIP) backbone, utilizing free-text cardiologist reports as an auxiliary supervisory signal during training to extract complex morphological features without requiring manual annotations. During inference, patient clinical metadata generates an attention mask that dynamically scales the extracted visual embeddings. This early-fusion gating mechanism balances information across modalities, enabling the model to adjust its visual processing based on each patient’s risk profile. Evaluated on ten highly imbalanced cardiovascular abnormalities from the MIMIC-IV dataset using Focal Loss, the proposed model achieves a weighted average F1-score of 77.4%, overall AUC of 95.83%, Accuracy of 93.82%, Precision of 77.08% and Recall of 77.85%, establishing a highly competitive benchmark against current state-of-the-art multi-label classifiers. Additionally, the integration of clinical context significantly improved the predictive confidence for critical ischemic events, boosting the detection rate for Acute Myocardial Infarction (AMI) by over 61% compared to an image-only baseline. 
These results demonstrate that patient clinical context is an indispensable prior, effectively transitioning theoretical ECG classifiers into robust, safety-first automated triage tools. |
| 11:45 | High risk and preventable harm groups identified in clustering of older patients on features associated with adverse drug reactions PRESENTER: Volodymyr Chapman ABSTRACT. Adverse drug reactions (ADR) leading to hospitalization cause considerable physical and emotional harm to patients and have been estimated to cost the UK NHS £2.21Bn per year. Structured Medication Reviews (SMRs) aim to prevent such harm through comprehensive review and revision of medications. Challenges remain in objective selection of patients for SMR, considering both risk of harm and potential for medicine optimization. We present subgroups of older patients with distinct characteristics and use of medications previously reported to associate with preventable ADRs. Published ADR event codes were used to classify ADR hospitalizations in 634k electronic healthcare records from the CPRD AURUM dataset, filtered for patients defined as older (65+ years) on 01/04/2019. Time to ADR hospitalization was monitored from this date to 31/03/2020. Patients were split into training (90%) and testing (10%) partitions, stratified for equal proportions of ADR hospitalization. LASSO Cox regression extracted features associated with ADR hospitalization risk from 1,014 features describing medication, patient demographics and clinical characteristics. Finally, semi-supervised clustering was performed on extracted features to group patients on ADR risk. LASSO Cox regression extracted 74 features associated with ADR hospitalization, including scaled age (hazard ratio / HR: 2.48), alcohol liver disease (HR: 1.75) and unrecorded ethnicity (HR: 1.44). Patients clustered into two older groups with high ADR hospitalization risk, two disease-specific groups and a healthy ageing group. This work identified 5 patient subgroups with characteristic features and ADR risk. Future work will investigate scope for medicines optimization within groups, for prevention of ADR harm. |
| 12:00 | Subgroup Analysis for Risk of Fall Correlation Using the UK Biobank dataset PRESENTER: Efterpi Karapintzou ABSTRACT. Falls are a major public health problem, with serious implications for the functionality and quality of life of adults. Although various machine learning approaches have been proposed for predicting fall risk, most are based on uniform models for heterogeneous populations, ignoring the substantial differences between disease categories. This study proposes a machine learning framework based on subgroup analysis to predict the risk of falls in different disease categories using data from the UK Biobank. Participants were grouped into clinically relevant disease categories, and multiple machine learning models were developed and evaluated for each subgroup. Model performance was evaluated using accuracy, sensitivity, specificity, ROC-AUC, and F1-score, while the calibration of probabilistic predictions was examined using the Brier score. In addition, explainable artificial intelligence techniques were applied through SHAP to interpret predictions. These results indicate that the performance of the models differs greatly across the disease types, resulting in moderate to high values of ROC-AUC, up to a maximum of 0.95 in some subgroups. Overall health was identified as the most important factor in most subgroups, whereas the importance of the activity factor was higher in subgroups of hematological diseases. In summary, these findings highlight the potential of subgroup-based, interpretable machine learning models to support more personalized and clinically actionable fall risk assessment in populations with chronic diseases. |
| 12:15 | Correlation-Based Validation of Multidomain Geriatric Assessment Tools for Older Adults Monitoring ABSTRACT. Multidomain geriatric assessment batteries are increasingly integrated into digital health platforms for older adults monitoring, yet their internal consistency and cross-domain behavior under real-world conditions remain insufficiently examined. This study presents a correlation-based evaluation of a multidomain geriatric assessment dataset collected at two time points from 616 community-dwelling older adults (i.e., above 65 years old) in Greece. Pairwise Pearson correlation analysis examined test-retest stability, construct-level associations, and cross-domain coherence across balance, frailty, mental health, sleep quality, quality of life, and lifestyle measures. Strong correlations (r > 0.9) were observed for test-retest measurements, anthropometric clusters, and frailty-balance-fear constructs, while moderate correlations (0.4 < r < 0.7) reproduced known relationships between depression, sleep quality, functional independence, and quality of life. Composite multidomain screening subscores exhibited weak internal coherence and limited test-retest stability, highlighting implementation-sensitive limitations. These findings support correlation analysis as a transparent validation step prior to integrating geriatric assessment batteries into digital health and decision-support platforms. |
| 11:00 | IndoClinNER: Overcoming Medical Prior Bias in Clinical De-identification via Adversarial Surname Injection PRESENTER: Sayantan Mandal ABSTRACT. Nursing shift handovers automated via Retrieval-Augmented Generation (RAG) promise significant reductions in administrative burden. Our prior work demonstrated a local-first RAG framework deployable on consumer CPU hardware, achieving a 43.2% reduction in handover time while maintaining zero patient identifier leakage through deterministic regex privacy controls. However, regex-based de-identification triggers false positives when common Bengali and Hindi names (Joy, Deep, Anal) overlap with English vocabulary and medical terminology, risking desensitization to genuine privacy warnings over time—a precursor to alert fatigue. Conversely, Western-trained Named Entity Recognition (NER) models exhibit what we term Medical Prior Bias, systematically failing to detect these homonymous names in clinical contexts. We present IndoClinNER, a hybrid privacy architecture combining deterministic regex, contextual NER, and Adversarial Surname Injection (ASI)—a novel inference-time technique that exploits learned bigram dependencies by synthetically injecting surnames to force syntactic disambiguation. To address the scarcity of annotated code-mixed clinical data, we developed a Dual-Path augmentation strategy: injecting realistic Bengali names—constructed from 438 unique first names and 86 surnames extracted from West Bengal voter lists—into authentic MIMIC-III nursing notes via a constrained LLM pipeline. On 450 synthetic adversarial sentences across three independent runs, ASI achieved 99.44% recall with 99.44% precision. On 97 expert-generated clinical notes, the system achieved 84.4% recall, with failure analysis confirming that errors occurred predominantly in telegraphic syntax lacking grammatical markers. 
The system operates at 20 ms mean inference latency on a 33M-parameter model running on consumer CPU hardware, suitable for resource-constrained settings without GPU infrastructure. |
| 11:10 | PrivFusion: A Privacy-preserving Multi-Agent Framework for Harmonizing Distributed Datasets ABSTRACT. The growing availability of clinical data has increased the use of machine learning, yet centralized data aggregation is often infeasible for sensitive health information. Federated Learning (FL) offers a distributed alternative, but its adoption is limited by substantial heterogeneity across institutional datasets, making harmonization a critical but frequently overlooked prerequisite for multi-site analytics. We introduce PrivFusion, a privacy-preserving multi-agent framework that automates the harmonization of structured datasets prior to federated training. PrivFusion uses agents to analyze local data, cluster semantically similar features across sites, and provide iterative transformation recommendations until alignment is achieved. Evaluation across four heterogeneous COVID-19 datasets demonstrates that PrivFusion effectively and efficiently harmonizes multi-site data while substantially reducing manual effort. |
| 11:20 | Vulnerability Audits for Connected Medical Devices PRESENTER: Diego Narciandi-Rodríguez ABSTRACT. Connected medical and wellness devices increasingly act as front-ends for sensitive physiological data and, in some cases, as inputs that may influence health-related decisions. Their typical architecture, comprising embedded sensors, a companion mobile application, and cloud services, expands the attack surface beyond the device itself and means that security failures can become clinically relevant through privacy loss, data integrity issues, and reduced availability. This paper presents a reproducible black-box audit of nine commercially available connected health devices (glucometers, a blood pressure monitor, thermometers, a pulse oximeter, and wearables). We apply a structured evaluation framework derived from ETSI consumer IoT baseline requirements and conformance assessment, adapted to the connected-health context by emphasising data protection, secure onboarding, and update trust. The audit covers 31 test cases organised into eight operational vectors (network exposure, firmware/OS, update mechanisms, communications security, configuration portals, mobile app security, authentication/account security, and physical/auxiliary surface), and uses conservative evidence criteria (no exploitation, minimal reproducible indicators) to support repeatability and responsible disclosure. Across 150 applicable test outcomes, 58.0% were favourable, 23.3% were unfavourable, and 18.7% were inconclusive due to limited technical transparency or insufficient artefacts. We conclude with actionable recommendations for manufacturers and procurement, focused on secure-by-design onboarding, verifiable update pipelines, and measurable privacy controls. |
| 11:30 | Informed, Empowered, and Heard: AI and Retrieval-Augmented Generation as Tools for Equitable Patient Information in Oncology ABSTRACT. Cancer diagnosis confronts patients with an immediate and urgent need for reliable, comprehensible, and personalised information. Yet across Europe, and acutely in Greece, access to high-quality oncology information remains profoundly unequal, shaped by geography, language, socioeconomic status, digital literacy, and the time constraints of overstretched healthcare systems. This paper examines how artificial intelligence (AI), and specifically Retrieval-Augmented Generation (RAG) architectures, can address this inequity by delivering dynamically personalised, evidence-grounded patient information at scale - built by patients and for patient needs, not for hospital workflows. Beyond the technical dimension, the paper argues that the true transformative potential of AI in oncology communication lies not in replacing the human relationship between patients, informal carers, and medical professionals, but in strengthening it to create synergies of collaboration. When patients arrive at clinical encounters better informed, their capacity for shared decision-making is enhanced, their autonomy is respected, and the therapeutic alliance is deepened. The paper additionally draws on neuroscientific evidence on the cognitive burden imposed by cancer diagnosis, arguing that information systems must be designed for the compromised cognitive state of their recipients, not for the idealised attentive reader. Drawing on the Greek oncology context as a specific case within the broader European landscape, and grounding the analysis in established frameworks for patient rights, health literacy, and co-design, this paper presents a principled framework for deploying AI-RAG systems in cancer care in ways that prioritise equity, empowerment, and trust. |
| 11:40 | A Quantum-Inspired Framework for Secure and Intelligent Healthcare Systems in Pandemic-Scale Response PRESENTER: Rayane Elrabaa ABSTRACT. Large-scale pandemics put enormous pressure on healthcare systems and reveal the lack of effective mechanisms for both data communication and intelligent decision support. Most existing methods treat networking and artificial intelligence as separate components, which results in delayed delivery of clinically critical data and suboptimal exploitation of heterogeneous medical information. This PhD thesis proposes a layered, secure, and intelligent framework for pandemic-scale healthcare response, based on two complementary research directions: clinically aware network control and multimodal intelligence. The resulting framework combines priority-aware data routing with state-of-the-art multimodal fusion techniques to deliver timely risk assessment, early warning, and clinical decision support even at the peak of a health crisis. This research aims to formulate universally effective design principles for durable distributed healthcare systems that can handle extremely high data volumes, unpredictability, and urgent situations like those experienced during global pandemics. |
Buffet Menu in Octagon Restaurant
Keynote Title
Techniques for Exploiting Machine Learning and Explainable Artificial Intelligence in Healthcare
Prof. Domenico Talia
University of Calabria, Italy
Noida University, India
Co-founder DtoK Lab
Keynote Abstract
Artificial intelligence techniques and systems are demonstrating their effectiveness in solving problems in many application areas, including healthcare. Over the years, several studies focused on leveraging machine learning and deep learning techniques to identify diseases in patients. However, while these techniques have demonstrated remarkable accuracy in diagnosis, they often operate as “black box” models, meaning they provide outputs without clear explanations of the rationale behind their decisions.
In response to this challenge, explainable artificial intelligence (XAI) has emerged as a research area aiming at providing not only accurate diagnoses but also understandable and interpretable explanations for the decisions made by AI models.
Large Language Models (LLMs) may be exploited to provide such explanations. This talk introduces and discusses cutting-edge XAI techniques for healthcare applications, which hold the promise of enhancing trust, enabling clinicians to better understand and contextualize AI results, and ultimately improving patient care. Some significant case studies are presented.
Short Biography
Domenico Talia is a full professor of computer engineering at the University of Calabria, Italy and an Honorary professor at Noida University, India. He is a co-founder of the start-up DtoK Lab. His research interests include Big Data analysis, artificial intelligence, high-performance computing, parallel and distributed machine learning, Cloud computing, social data analysis, and parallel programming models and languages. Domenico Talia has published 10 books and more than 400 papers in archival journals such as Communications of the ACM, IEEE TPDS, IEEE Computer, IEEE TKDE, IEEE TSE, IEEE TSMC-A, IEEE TSMC-B, IEEE Micro, ACM Computing Surveys, FGCS, Parallel Computing, IEEE Internet Computing, and highly reputed conference proceedings. He is a member of the editorial boards of IEEE Computer, ACM Computing Surveys, the Future Generation Computer Systems journal, and other archival journals. He served as a program chair or program committee member of many international conferences. Talia has been the recipient of the Euro-Par Achievement Award 2025. He is a senior member of the Association for Computing Machinery (ACM) and IEEE Computer Society and has been a reviewer for several research agencies and public administrations.
Parquet Lobby Area (in the Main Lobby). Please enjoy your coffee and visit the Showcases Area in the Panorama Room.
| 16:00 | SAFER-Bench: A Comprehensive Benchmarking Framework for Evaluating Medical Federated Retrieval-Augmented Generation (RAG) Systems ABSTRACT. Retrieval-Augmented Generation (RAG) systems are crucial for building reliable AI applications in privacy-sensitive domains such as healthcare. However, existing RAG benchmarks assume centralized data access, while federated learning benchmarks focus primarily on model training rather than retrieval-based workflows. We introduce SAFER-Bench, a comprehensive framework for evaluating Federated RAG systems with approval-based privacy controls in medical settings. SAFER-Bench enables systematic evaluation of retrieval algorithms, merging strategies, and language models under realistic federated constraints where Data Owners maintain complete control over their private medical corpora. We evaluate the framework across five federation configurations (1-4 Data Owners) using five language models spanning three size categories: SmolLM2-1.7B-Instruct, BioMistral-7B and Mistral-7B-Instruct-v0.3, and Llama-3.3-70B-Instruct and OpenBioLLM-Llama3-70B. Our experiments on 200 medical questions from PubMedQA and BioASQ reveal that domain specialization, not model size, determines federation benefits: medical-specialized models benefit from federation at both 7B (+3–5%) and 70B (+5.2%) scales, while general-purpose models consistently perform worse with federation regardless of size (7B: -1 to -2.5%, 70B: -2 to -6%). These findings provide practical, evidence-based guidance for deploying privacy-preserving RAG systems in collaborative healthcare networks. |
| 16:15 | TIME-HF: Transformer for Integrated Medical Embeddings in EHR-based Heart Failure Prediction ABSTRACT. Heart Failure (HF) is a complex clinical condition affecting over 64 million people worldwide (Chen et al., 2025). The prevalence of HF is increasing and it continues to be a major cause of unplanned hospitalisations (Savarese et al., 2023). Despite advances in clinical interventions to anticipate and improve post-diagnosis outcomes, the potential incidence of HF has been particularly challenging to predict before onset. Over the last few years, electronic health records have been increasingly used for HF screening. While reporting promising evaluation metrics, such screening tools are rarely implemented in routine clinical practice because they suffer from significant limitations, e.g.: (i) reliance on both primary and secondary care data where linkage is non-trivial, (ii) patient representations using simplistic, summarised features over the health timeline, (iii) raw diagnosis, procedure, and medication codes lacking clinical context used as features, and (iv) patient visits being weighed identically in model design. In addition, the non-uniformity of temporal gaps between consecutive patient visits is often ignored. Here, we develop TIME-HF, a transformer-based model on semantically rich patient representations to predict incident HF 6 months ahead of time. Empirical results demonstrate remarkable performance using predictors only from primary care. Our method, trained on 18k patients, shows high sample efficiency, reaching AUC, precision and recall of 0.82, 0.73 and 0.74 respectively. TIME-HF outperforms traditional methods that use statistical summaries of patient records. Our work serves as a proof-of-concept emphasizing the importance of incorporating temporal features for HF prediction and evidencing the richness and adequacy of primary care data for early identification of at-risk individuals. |
| 16:30 | Reinforcement Learning Framework for Patient Allocation Between Complex and Regional Oncology Centres in Czechia ABSTRACT. Managing stochastic oncological demand in the Czech Republic challenges planning for Complex Oncology Centres (COC) and Regional Oncology Centres (ROC). Herein, we present a reinforcement-learning framework that learns dynamic patient-allocation policies to maximize treatment efficiency while respecting capacity and other relevant operation constraints. A discrete-time simulator captured key operational features of lung cancer oncology care, including daily arrivals, multi-day treatments, patient deterioration, and COC capacity limits. The agent observed a five-dimensional state vector, i.e. severity, waiting time, length of stay, current COC occupancy, total assigned patients, and outputted probabilities for assigning each patient to COC versus ROC. Across several configurations, the deep Q-network (DQN) policy aligned closely with desired COC capacity, maintained the buffer constraint, and achieved higher cumulative reward than non‑learning baseline policies. The framework might act as a lightweight digital twin, supporting scenario planning, adaptive allocation, and pilot deployment on real data. |
| 16:45 | Benchmarking Signal Reconstruction Pipelines Against Direct Visual Feature Extraction for Digitized ECGs PRESENTER: Łukasz Jeleń ABSTRACT. Although deep learning models demonstrate superior performance in interpreting 1D electrocardiogram (ECG) data, a vast number of clinical records are archived as static images, limiting the deployment of state-of-the-art models. This study investigates whether to address this by reconstructing the 1D signal from the image or by applying computer vision models directly to the image. Utilizing the PTB-XL dataset and ECG-Image-Kit, we benchmark a direct visual feature extraction approach (using EfficientNet-B0 and DINOv3) against a signal reconstruction pipeline (U-Net followed by a custom 1D ResNet). The models are evaluated on biological age regression, sex classification, and pathology detection. Results indicate that the 1D ResNet model operating on reconstructed signals consistently outperforms 2D vision-based models across all tasks, despite having significantly fewer parameters. For instance, the 1D model achieved an age regression mean absolute error (MAE) of 9.09 years compared to over 14.30 years for the 2D models. The findings suggest that 1D temporal representations of ECG data are more information-dense for diagnostics, and that targeted signal processing remains a more robust framework than direct image analysis using general-purpose foundation models. |
| 17:00 | AI-Assisted Secondary Use of Clinical Research Data: A Three-Dimensional Personalization Framework for Scientific Publication Drafting ABSTRACT. The secondary use of clinical research data for scientific publication remains a significant bottleneck in medical research, as collected datasets are rarely transformed into disseminated findings due to the complexity of the data and the writing process. In this paper, we propose an open-source three-dimensional personalization framework (Lab, Personal, Global) to dynamically control these knowledge influences during paper preparation. The framework extracts ML-based writing style features from reference papers and constructs adaptive prompts based on the set influence levels. The system supports four AI vendors (Groq Llama 3.3 70B, Google Gemini, OpenAI GPT-4, and local GPT-OSS 120B) through a unified abstraction layer. An evaluation with 20 researchers showed significant improvements in writing consistency over baseline methods, along with strong usability scores (SUS 80.78). The framework was validated using real-world clinical data from the RAISE platform, demonstrating its potential to accelerate the transformation of research datasets into publishable scientific output. |
| 16:00 | Towards Mask-Free Multi-Threshold Segmentation of Liver Lesions in CT Images Using a Multi-Objective Evolutionary Approach ABSTRACT. Liver cancer remains a critical global health challenge characterized by high incidence and mortality rates. In clinical practice, the accurate segmentation of liver lesions is often hindered by their morphological complexity and a heavy reliance on time-consuming, expert-driven manual annotations. To address these limitations, this study proposes an automated framework for liver lesion mask generation based on a multi-objective, multi-threshold formulation. The approach simultaneously optimizes four objective functions: Otsu, Minimum Cross-Entropy (MCE), Minimum Error Thresholding (MET), and Tsallis Entropy. Experimental evaluations demonstrate that the proposed methodology achieves a DICE coefficient of 0.751, outperforming several supervised segmentation models within the Liver Tumor Segmentation Benchmark (LiTS). These results highlight the method's potential as an efficient, lightweight, and unsupervised alternative for enhancing diagnostic workflows in medical imaging. |
| 16:15 | Semi-supervised GAN-based Segmentation of CTPA Scans for Pulmonary Emboli Boundary Extraction ABSTRACT. Pulmonary embolism (PE) is a pulmonary vascular disease associated with high morbidity and mortality, where delayed diagnosis can be fatal. While computed tomography pulmonary angiography (CTPA) is the clinical gold standard for PE detection, interpreting these scans manually is time-consuming and error-prone. In this work, we propose a semi-supervised segmentation method, employing generative adversarial networks (GANs) to leverage large pools of unlabeled CTPA scans. The proposed method addresses the challenges of CTPA image segmentation more effectively than previous methods by employing the Dice loss and the cross-entropy loss to tackle the highly imbalanced nature of CTPA scans, the feature matching loss to encourage the generator to produce anatomically plausible segmentation maps that are indistinguishable from ground truth, and the self-training loss to reinforce learning from highly confident pseudo-labeled samples. The experimental evaluation on a publicly available dataset shows that the proposed method maintains accurate segmentation despite a limited number of labeled CTPA scans, with a Dice coefficient of 88.75% and a 95th percentile Hausdorff distance (HD95) of 1.74, outperforming fully supervised baselines. |
| 16:30 | Region-Specific Evaluation of Carotid Artery Segmentation in Cross-Sectional Ultrasound Images Using Deep Learning Models ABSTRACT. Accurate segmentation of the carotid artery in cross-sectional ultrasound images is essential for vascular assessment, intima-media thickness measurement, and plaque evaluation. Although deep learning approaches have shown promising performance in vascular imaging, segmentation remains challenging in anatomically complex regions due to increased structural variability and irregular artery morphology. This study evaluates fully automated, segmentation-based deep learning models for carotid artery segmentation in cross-sectional ultrasound images from the VIPVIZA cohort. The dataset includes anatomically diverse regions such as the common carotid artery (CCA), carotid bulb, and split segments, which introduce morphological variability, non-circular artery shapes, and complex surrounding tissue structures that increase segmentation difficulty. Five deep learning models were investigated: YOLOv8, YOLOv11, U-Net, Attention U-Net, and DeepLabv3+. Images were preprocessed using intensity normalization and contrast enhancement. Performance was evaluated using the Dice similarity coefficient, Intersection-over-Union (IoU), and pixel-wise accuracy. The results demonstrate that while all models achieved stable performance in the relatively homogeneous CCA region (Dice up to 0.924 ± 0.053), segmentation performance decreased in anatomically complex areas such as the bulb and split regions. The split region posed the greatest challenge for all models, with lower Dice scores overall. DeepLabv3+ consistently achieved the highest performance across all regions, including the bulb (Dice 0.914 ± 0.055) and split (Dice 0.852 ± 0.112), indicating accurate boundary delineation in morphologically variable areas. YOLO-based models demonstrated competitive performance, though with slightly lower overlap metrics. 
These findings highlight the importance of evaluating segmentation models across anatomically diverse subregions to ensure robust and reliable performance. |
| 16:45 | Physics-Informed Neural Networks and Domain Decomposition for the Solution of Glioblastoma Invasion 2D model ABSTRACT. Glioblastoma (GBM) is a highly aggressive brain tumor whose proliferation is classically modeled using reaction-diffusion partial differential equations (PDEs). While traditional mesh-based numerical solvers provide accurate approximations, they yield purely discrete solutions. This paper proposes a mesh-free computational framework to simulate the 2D spatio-temporal invasion of GBM using Physics-Informed Neural Networks (PINNs) and Extended PINNs (XPINNs). To address network convergence failures caused by sharp gradients and discontinuous initial conditions, we introduce a logistic smoothing strategy coupled with domain non-dimensionalization. Furthermore, an XPINN architecture is implemented to decompose the computational domain, enabling parallelization and significantly accelerating the training process. Unlike classical numerical methods, the proposed deep learning approach yields a continuous, closed-form surrogate solution, allowing for instant tumor density predictions at any arbitrary time and location without the need for post-hoc interpolation. This framework demonstrates the viability and computational efficiency of domain-decomposed physics-informed learning for analyzing tumor dynamics. |
| 17:00 | RAR-Net: Efficient Recurrent Anatomical Refinement for Medical Image Denoising ABSTRACT. Anatomical refinement refers to the process of enhancing the precision, detail, accuracy, or structural understanding of anatomical models, images, or surgical procedures. In this direction, high-resolution medical image denoising requires preserving anatomical structures while remaining computationally efficient. To address this challenge, we propose RAR-Net, a lightweight recurrent model for anatomical refinement of denoised images that combines gated feature refinement and global consistency modeling. Evaluated on chest X-ray (CheXpert), brain MRI (IXI), and lung CT (LIDC-IDRI) with several different Gaussian noise levels, RAR-Net achieves PSNR and SSIM comparable to other models and also performs well at high noise levels. Additionally, with only 319.5k parameters and 0.0046 s of inference time per image, RAR-Net provides an effective balance between restoration quality and computational cost, supporting its deployment in many resource-constrained medical imaging applications. |
| 17:15 | Diffusion Models for Stroke Outcome Prediction from Acute CT Perfusion Slices: A Preliminary Study PRESENTER: Mina A. Nessiem ABSTRACT. Predicting final infarct extent from acute CT perfusion imaging is critical for stroke treatment decisions, yet remains challenging due to extreme class imbalance and the complex relationship between acute hemodynamics and tissue fate. While discriminative approaches dominated the recent ISLES'24 challenge, diffusion probabilistic models have not yet been evaluated for final infarct prediction from CT perfusion parameter maps. We evaluate MedSegDiff, a conditional diffusion architecture, on the ISLES'24 dataset using Tmax and CBF inputs (2D slice-based evaluation on fold 0, foreground slices only). With threshold optimization, our approach with 5-sample mean ensembling achieves 13.2% Dice, a 38% relative improvement over a 2D nnU-Net baseline (9.6% Dice), with notably higher sensitivity (0.221 vs 0.086). At a fixed threshold τ = 0.5, the diffusion model still outperforms nnU-Net (11.0% vs 9.6% Dice). The stochastic nature of diffusion sampling enables effective test-time ensembling, with AUC improving from 0.670 (single sample) to 0.740 (5 samples). |
| 17:25 | Exploring Transformation-Based Radiomic Features in Breast Cancer ABSTRACT. Radiomics has emerged as a promising approach for tumour characterization by extracting quantitative features from medical images. Beyond features computed from original images, transform-based radiomics applies mathematical transformations to highlight complementary image properties. However, the contribution of individual transformations to classification performance remains unclear. In this study, we systematically evaluate the impact of commonly used transform-based radiomic features for breast lesion classification in mammography. Six public datasets with different acquisition settings were analysed. Radiomic features were extracted from original images and from multiple image transformations. Classification models were trained using features from individual transformations, the original radiomics, and a combined set including all transformations. An ablation study and correlation analysis were performed to evaluate the contribution and redundancy of the different transformation types. Results show that while certain transformations tend to outperform original radiomic features more consistently, the common practice of indiscriminately aggregating multiple radiomic transformations may not be optimal and should be dataset-aware. Although the combined model generally provides robust performance, it does not systematically achieve the best results. These findings highlight the importance of critically evaluating transformation-based radiomics and motivate future work on selective features and transformation strategies for breast lesion classification. |
| 16:00 | A Data-Driven Approach to Support Clinical Renal Replacement Therapy ABSTRACT. Objective. The study aimed to develop and evaluate the viability of a data-driven approach to predict membrane fouling in critically ill patients undergoing CRRT using machine learning algorithms. Moreover, Counterfactual Analysis is used to detect counterfactuals, i.e. the minimal modifications to the input of the machine learning model required to revert a membrane fouling prediction. Design. The study utilizes time series from Careggi University Hospital ICU. A subset of 16 specific features was recognized, following the recommendations of clinicians, as the most relevant indicators to train machine learning models for predicting membrane fouling dynamics. To keep the approach simple, interpretable, and amenable to detecting reliable counterfactuals, the study mainly focuses on a tabular data approach, not involving the time series’ interdependence inherent within each treatment. Since the number of membrane fouling cases is considerably smaller than the overall number of treatments, the ADASYN oversampling method was utilized as a preprocessing step for a more equitable representation of the minority classes. A Shapley-value-based Counterfactual Analysis is applied to the best prediction model, in order to detect counterfactuals whose quality is measured through a proper score. Results. The specific methods adopted include Random Forest, XGBoost and LightGBM. For all these methods a rebalancing rate of 10% with respect to the majority class showed the most balanced performance, with a sensitivity of 0.776 and a specificity of 0.963. The performance obtained by all methods proved robust across forecasting horizons of different lengths. The tabular data approach proved not to be a limitation, as it outperformed the Long Short-Term Memory Recurrent Neural Networks, which intrinsically take into account temporal relationships. 
It is shown that by reducing the features to 5 via a feature selection method, we obtain simpler and more interpretable models without compromising accuracy too much. The adopted Counterfactual Analysis method is able to detect counterfactuals that seem promising according to the considered quality score. Conclusions. The experimental study provides promising results concerning the adoption of data-driven machine learning methods to predict membrane fouling events during CRRT. The practical implications for clinicians and nurses managing CRRT are significant; additionally, the interpretability afforded by using a reduced set of features enhances the understanding of how the model arrives at its conclusions without sacrificing too much predictive power. The predictions of the model and the associated Counterfactual Analysis can inform therapeutic adjustments leading to more effective patient care while minimizing the risk of membrane fouling. |
| 16:15 | A Temporally Aligned MIMIC-IV ED Pipeline for Multi-Outcome Prediction ABSTRACT. Clinical deterioration among emergency department (ED) patients continues to be a leading cause of preventable morbidity and mortality. Although machine learning models for deterioration prediction are becoming more prevalent, inconsistent dataset construction and insufficient temporal alignment limit reproducibility and introduce temporal leakage. We present a modular and extendable end-to-end pipeline for converting raw MIMIC-IV data into analysis-ready datasets with explicit prediction times, harmonized multi-source event logs, and configurable outcome labels for eight clinical event types. The pipeline creates a cohort of 424,385 adult ED visits (202,080 patients) and generates feature sets for 1-hour (18 features), 6-hour (56 features), and 24-hour (61 features) observation windows, with the option to incorporate MIMIC-IV-ECG electrocardiogram measurements. Validation using logistic regression and gradient-boosted machines with 5-fold stratified cross-validation for benchmark comparison with an established MIMIC-IV-ED cohort shows high concordance (κ = 0.977) and performance parity. All code and SQL transformations are publicly available, allowing reproducible investigation of ED deterioration. |
| 16:25 | Temporal Drift in Action: Evaluating Strategies to Detect and Repair Drift using Real-World Data on Stroke and Myocardial Infarctions ABSTRACT. Clinical prediction models are increasingly embedded within healthcare systems, yet their performance can deteriorate over time due to temporal drift. Such drift may arise from changes in population characteristics, data recording practices, or underlying clinical relationships, and can lead to miscalibrated risk estimates that compromise patient safety. In this study we compare multiple strategies to detect (and potentially repair) temporal drift, including the monitoring of performance metrics, analysis of model residuals using statistical process control and Kullback–Leibler divergence, and assessment of input data stability through discrimination error. We apply these approaches to a real-world case study using data from Connected Bradford, evaluating a logistic regression model aligned with QRISK‑2 for predicting 10-year risk of heart attack or stroke. Model behaviour was assessed monthly over a seven-year period (2008 to 2015), with recalibration triggered whenever predefined thresholds were exceeded. Our findings show clear evidence of temporal drift, with degradation in calibration and increasing divergence in residual distributions over time. Approaches based on maintaining performance thresholds produced the most accurate and stable predictions, although methods using model residuals offered similar performance. Regular recalibration at fixed intervals demonstrated reasonable accuracy while offering operational advantages due to predictable resource requirements. Methods independent of model residuals, such as discrimination error, detected drift without requiring long-term outcome data and may therefore be more viable in contexts with substantial delays before outcomes can be observed. 
Overall, the results highlight the importance of systematic drift monitoring for clinical prediction models intended for deployment at scale. The software library released as part of this research provides practical tools for detecting and mitigating drift, with clear trade-offs between statistical performance, regulatory considerations, and real-world feasibility. |
| 16:35 | Transformer-Based Cardiovascular Event Prediction Using National Claims Data ABSTRACT. Cardiovascular disease (CVD) remains a global health priority. While administrative claims data offer a longitudinal view of patient history, their high dimensionality and extreme class imbalance make traditional risk prediction difficult. This study evaluates transformer-based architectures (BERT, BioBERT, and ClinicalBERT) for one-year CVD event prediction using a nationwide health registry, the French National Health Data System (SNDS). Using a cohort of 10.7M individuals, we compared the performance of these models against a Random Forest (RF) baseline. While all models showed high accuracy (> 98%) due to low event prevalence (1.2%), domain-adapted transformers (BioBERT and ClinicalBERT) significantly outperformed RF in clinical utility, achieving an F1-score of 16.8% and a 15-fold increase in recall. These results show that although transformer models capture some longitudinal information, overall predictive performance is modest, likely due to the loss of information when structured claims data are converted into text format for transformer models. |
| 16:45 | Predicting Emergency Admissions Following Chemotherapy: A Workload Planning Approach ABSTRACT. Patients undergoing cancer chemotherapy are often at high risk of unplanned hospital admissions because of their disease or treatment. These admissions are often to a specialist unit with limited capacity, so it is of value in planning resource allocation to know when peaks and troughs in demand are likely to occur. In this study we have sought to produce a machine-learning-based model able to assess patients starting chemotherapy and their risk of subsequent unplanned admission, with a view to producing a forecast of likely unplanned admission numbers in subsequent weeks. Our models perform as well as or better than previously reported models, with an AUROC of 0.82 for the best models, but when applied to the expected number of admissions in each week, there remains a substantial amount of unexplained variability in observed admission numbers beyond our predicted values. |
| 16:55 | Real-World Validation of a Predictive Model for Length of Stay in Geriatric Settings PRESENTER: Chiara Dachena ABSTRACT. Early identification of patients at risk of prolonged hospital length of stay (LOS) is crucial for optimizing resource allocation and improving care in geriatric settings. We previously developed an ensemble machine learning model to predict prolonged LOS using routinely collected clinical and care-intensity variables. In the present study, we performed a real-world validation of the model within the same institution. Validation was conducted on two macro-groups: (1) a temporally subsequent cohort meeting the original inclusion criteria and (2) a clinically distinct subgroup. Each macro-group was further stratified into three temporal windows (July–December 2023, full year 2024, January–June 2025). Model performance was evaluated across predefined clinical clusters using Accuracy, Positive Predictive Value (PPV), and Negative Predictive Value (NPV). Weighted Generalized Linear Regression models were applied to assess group-by-day interactions and temporal stability. Overall, Accuracy and NPV remained stable across most clusters and validation groups, with no significant interaction between day and group in the majority of analyses. In contrast, PPV demonstrated greater inter-group variability, with significant day-by-group interactions across clusters. Comparative analyses confirmed that differences in PPV were primarily attributable to variations in outcome prevalence and case-mix rather than systematic degradation of model performance. The model maintained robust predictive performance over time and across clinically distinct cohorts within the same institutional setting. While PPV was sensitive to contextual factors, overall accuracy and negative predictive capacity remained stable, supporting the model’s potential utility as a real-time decision support tool for identifying patients at risk of prolonged LOS in geriatric care. |
| 17:05 | Clustering Type 1 Diabetes Patients based on Short-term Glycemic Dynamics ABSTRACT. The analysis of Continuous Glucose Monitoring (CGM) data using pattern-based methods has emerged as a promising approach for characterizing short-term glycemic dynamics in Type 1 Diabetes. However, in large real-world longitudinal cohorts, highly unequal data contribution across patients introduces significant bias during unsupervised temporal pattern extraction. To address this, we introduce a robust patient-level subsampling strategy to balance data contribution while preserving patient variability. We apply this methodology to a large cohort of 643 individuals, encompassing over five million two-hour glucose windows. Dynamic Time Warping clustering on the balanced dataset yielded six reproducible temporal glucose patterns, while hierarchical clustering of patient profiles revealed four distinct structural variability phenotypes. These results indicate that mitigating contribution bias preserves the core morphology of temporal patterns while revealing the variations in patient profile distributions, providing a scalable and reliable framework for large-scale CGM data analysis. |
| 17:15 | Exploratory GRU-Based Temporal Modeling of Longitudinal Multimodal Signals for Early Dropout in Psychotherapy ABSTRACT. High dropout rates in psychotherapy, particularly in depression, remain a major clinical challenge and are further exacerbated in vulnerable populations such as hikikomori youth. Machine learning has emerged as a promising strategy to support treatment monitoring; however, applications in Internet-delivered Cognitive Behavioural Therapy (ICBT), especially for early dropout warning, remain limited. In this context, we propose a longitudinal, speech-driven early-warning framework to monitor dropout risk among hikikomori patients undergoing ICBT in a real-world clinical setting. Instead of treating dropout as a static classification endpoint, disengagement is modeled as a discrete-time forecasting problem, where binary supervision is used only to estimate longitudinal risk trajectories across sessions. Our framework integrates multimodal information—self-supervised speech representations, interpretable acoustic descriptors, and behavioral engagement indicators—within a GRU-based temporal architecture. Results indicate that multimodal modeling enhances sensitivity to early disengagement signals and enables clinically meaningful early-warning indicators, achieving a positive-class recall of 0.73 with specificity of 0.50, and detecting risk up to 2 sessions before dropout. These findings highlight the potential of longitudinal speech-based monitoring to support proactive clinical intervention and improve retention in digital psychotherapy. |
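Several abstracts in this session report metrics that depend on outcome prevalence: the claims-data study attributes accuracy above 98% to an event prevalence of 1.2%, and the length-of-stay validation attributes PPV differences to prevalence and case-mix. The following is a minimal sketch of that arithmetic, not any author's evaluation code. The function names and the sensitivity/specificity values (0.80 and 0.90) are hypothetical and chosen only for illustration; the 1.2% prevalence is the one figure taken from the abstracts.

```python
def accuracy(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Overall accuracy as a prevalence-weighted mix of sensitivity and specificity."""
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value via Bayes' rule; NaN if nothing is predicted positive."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    total_pos = true_pos + false_pos
    return true_pos / total_pos if total_pos > 0 else float("nan")


if __name__ == "__main__":
    # A classifier that never predicts an event (sensitivity 0, specificity 1)
    # still reaches 98.8% accuracy when only 1.2% of cases are events.
    print(f"trivial accuracy at 1.2% prevalence: {accuracy(0.0, 1.0, 0.012):.3f}")

    # Hypothetical fixed classifier: PPV changes with prevalence alone.
    for prev in (0.012, 0.10, 0.30):
        print(f"prevalence={prev:.3f}  PPV={ppv(0.80, 0.90, prev):.3f}")
```

With sensitivity and specificity held fixed, PPV rises as prevalence rises, which is why a stable model can still show PPV variation across cohorts with different case-mix.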
A Tutorial of Recent Advances in Medical Video Analysis Systems
Marios S. Pattichis
Professor, Department of Electrical and Computer Engineering
The University of New Mexico, USA.
Abstract: Medical video analysis is undergoing a dramatic transformation that is heavily influenced by recent AI models. The tutorial will explore the evolution of AI models and datasets designed for video analysis. The tutorial will cover foundational concepts in the design of medical video analysis systems and show how these concepts are used in the development of 3D CNN and transformer-based architectures. Moreover, the tutorial will review recent trends, including the use of foundation models and Video Q&A systems applied to instructional medical videos. A summary of current and future work will also be provided.
Bio: Prof. Marios Pattichis received a B.Sc. degree (High Hons. and Special Hons.) in computer sciences, a B.A. degree (High Hons.) in mathematics, an M.S. degree in electrical engineering, and a Ph.D. degree in computer engineering from The University of Texas at Austin, Austin, TX, USA, in 1991, 1991, 1993, and 1998, respectively. He is currently a Professor and Director of online programs in the Department of Electrical and Computer Engineering at the University of New Mexico, where he recently introduced an online M.Sc. degree in Computer Engineering in Applied Machine Learning & Artificial Intelligence Systems Engineering. At UNM, he is also the Director of the Image and Video Processing and Communications Lab (ivPCL).
He has served as a Senior Associate Editor for the IEEE Transactions on Image Processing and for IEEE Signal Processing Letters; as an Associate Editor for IEEE Transactions on Image Processing, Pattern Recognition, and IEEE Transactions on Industrial Informatics; and as a Guest Associate Editor for special issues published in the IEEE Transactions on Information Technology in Biomedicine and for a special issue published in Biomedical Signal Processing and Control. He was elected as a Senior Member of the National Academy of Inventors and a Fellow of the European Alliance of Medical and Biological Engineering and Science (EAMBES) for his contributions to biomedical image analysis.
During the session, Awards for Best Paper and Best Student Paper will be announced and presented.
Rosa Sicilia (Università Campus Bio-Medico di Roma, Italy)