IEEE CBMS 2026: THE 39TH IEEE INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS
PROGRAM FOR WEDNESDAY, JUNE 3RD

09:00-10:30 Session 1A: Large Language Models and Conversational AI for Clinical Reasoning
Location: Panorama
09:00
Architecture, Evaluation Metrics, and Technical Feasibility of LLMs in Mental State Assessment: A Case Study in PTSD

ABSTRACT. Mental state evaluations for conditions such as anxiety, depression, and Post-Traumatic Stress Disorder (PTSD) are essential for effective treatment, yet traditional diagnostic methods often suffer from subjectivity and clinical bias. We propose 'MentState,' an architectural framework leveraging large language models (LLMs) to assess mental health severity from conversational data. This technology features a configurable prompt engine that supports single or conversational text while integrating specific clinical domain knowledge. MentState extracts qualitative mental biomarkers aligned with traditional diagnostic symptoms and enables model refinement through specialized mental health datasets. To evaluate the system, we defined performance metrics including compatibility with existing clinical scores, model self-confidence, and statistical reliability across multiple activations. We demonstrated feasibility using a cohort of 20 subjects exposed to traumatic events, analyzing conversational transcripts derived from real-life online interviews. Our evaluations were benchmarked against the IES-R self-report questionnaire. Results showed over 80% compatibility with the IES-R benchmark across various configurations. Furthermore, the model exhibited high reliability with a coefficient of variation in the range of 3%. This study pioneers a data-driven approach to enhance diagnostic precision and address limitations in clinical availability. By providing objective clinical decision support, MentState underscores the transformative potential of AI to improve patient outcomes and alleviate healthcare professional burnout.
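The statistical-reliability criterion mentioned in this abstract, a coefficient of variation across repeated model activations, is straightforward to compute; a minimal stdlib sketch with hypothetical scores (not the paper's data):

```python
import statistics

def coefficient_of_variation(scores):
    """CV = sample standard deviation / mean, reported as a percentage."""
    return statistics.stdev(scores) / statistics.fmean(scores) * 100.0

# Hypothetical severity scores from five repeated activations
# on the same transcript (illustrative values only).
runs = [21.0, 21.5, 20.8, 21.2, 21.0]
cv = coefficient_of_variation(runs)  # ~1.25
```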

09:15
Verbalized Uncertainty in Medical AI: Differential Diagnosis in Commercial LLMs

ABSTRACT. Large Language Models (LLMs) have revolutionized large-scale data processing in healthcare settings, including more efficient and readily available diagnostic models. Differential diagnoses are generated freely and introduced into the clinic by concerned patients. However, many biases are present, and little is known about the relationship between model correctness and the prediction's associated confidence. The current study analyzed three differently purposed LLMs in light of this relationship and visualized the calibration of medical LLMs. Sex-, age-, and pathology-stratified analyses were also performed separately to evaluate possible biases. Our results indicate that calibration moves from overconfidence to underconfidence when medical LLMs are prompted for a top-5 list of likely diagnoses instead of a single prediction. Moreover, we found no biases for sex or age groups, while a bias might exist for specific pathologies. We show that robust evaluation is key for trust in these medical LLMs and that more information is required before clinical adoption.

09:30
AyurAssist: Bridging Ayurvedic and Biomedical Clinical Knowledge Through Terminology-Grounded LLM Reasoning

ABSTRACT. Ayurveda is one of the world's oldest systematized medical traditions, yet standard biomedical ontologies such as SNOMED CT and ICD lack Ayurvedic diagnostic constructs, creating barriers to interoperability and clinical decision support. We present AyurAssist, a clinical decision support system that bridges this terminological gap through a vocabulary-grounded pipeline: biomedical named entity recognition (scispaCy) extracts clinical entities from free-text patient narratives, UMLS normalizes them to SNOMED CT and ICD codes, and fuzzy matching against the full 3,550-term WHO International Terminology for Ayurveda (ITA) constructs a structured Ayurvedic context for a large language model (Qwen3-32B), which performs three-pass clinical reasoning. We validate the system through three complementary experiments. First, benchmarking on BhashaBench-Ayur (14,963 questions) establishes Qwen3-32B (54.2%) as the top-performing model, outperforming the domain-specific AyurParam-2.9B (40.0%). Second, an ablation study over 80 clinician-annotated vignettes demonstrates that the terminology bridge yields a statistically reliable improvement in treatment quality (term-level F1: 0.219 vs. 0.156; near-disjoint 95% bootstrap CIs) and a directionally consistent gain in diagnostic accuracy (80.0% vs. 75.0%), while the bridge alone achieves only 5.0%, confirming that the LLM performs clinical reasoning while the bridge provides essential vocabulary grounding. Third, inter-rater reliability across four clinicians (two Ayurvedic, two allopathic) establishes ground-truth validity with substantial agreement for modern diagnosis (PABAK = 0.66) and moderate agreement for Ayurvedic diagnosis (PABAK = 0.56). The key insight is that domain-specific vocabulary grounding of a capable general-purpose LLM, rather than domain-specific model training, offers a practical and scalable path toward interoperable integrative medicine informatics.
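As an illustration of the fuzzy-matching step described above, Python's standard library can ground noisy extracted terms against a controlled vocabulary; the toy term list below is illustrative only, whereas AyurAssist matches against the full 3,550-term WHO ITA:

```python
from difflib import get_close_matches

# Toy stand-in for a terminology list; the actual system matches against
# the 3,550-term WHO International Terminology for Ayurveda.
ayurveda_terms = ["amlapitta", "prameha", "jvara", "kasa", "shwasa"]

def ground_term(extracted, vocabulary, cutoff=0.7):
    """Return the closest vocabulary entry, or None if nothing is close enough."""
    matches = get_close_matches(extracted.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(ground_term("Amlapita", ayurveda_terms))    # amlapitta
print(ground_term("penicillin", ayurveda_terms))  # None
```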

09:45
Multimodal Dual-Encoder Retrieval for Automated ICD Coding

ABSTRACT. Accurate International Classification of Diseases (ICD) coding is crucial for large-scale clinical research, documentation, and billing. Current ICD prediction methods face three primary problems: (1) they cannot exploit multimodal patient data, because they rely on either structured EHR data or unstructured clinical notes; (2) they struggle to scale to the large number of ICD codes (9K+ in ICD-9), as traditional classifiers need dense output layers and often do not generalize well to long-tail rare diseases; and (3) they lack transparency for clinical use. To address these challenges, this research proposes a two-stage framework that first retrieves ICD codes using a multimodal dual-encoder retrieval model, where structured and unstructured patient data are integrated through gated fusion. The second stage refines the top-k retrieved candidates with an LLM-based re-ranker that provides ranked codes with clinically relevant explanations. Our experiments show that the proposed approach improves Micro-F1 and Precision over a multimodal dual-fusion classifier baseline. These improvements demonstrate that combining a gated multimodal retrieval system with LLM-based re-ranking is a practical alternative to dense multi-label classification for automated ICD coding.
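The gated fusion mentioned above can be sketched numerically. This is one common element-wise formulation (a sigmoid gate blending two modality embeddings), not necessarily the paper's exact design; the vectors and gate weights are toy values:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_fusion(struct_vec, text_vec, gate_weights):
    """Element-wise gate: fused_i = g_i * s_i + (1 - g_i) * t_i,
    with each g_i in (0, 1) derived from both modalities."""
    fused = []
    for s, t, w in zip(struct_vec, text_vec, gate_weights):
        g = sigmoid(w * (s + t))  # toy gate conditioned on both inputs
        fused.append(g * s + (1.0 - g) * t)
    return fused

structured = [0.2, -1.0, 0.5, 0.0]   # toy structured-EHR embedding
text       = [1.0,  0.3, -0.2, 0.4]  # toy clinical-note embedding
fused = gated_fusion(structured, text, gate_weights=[1.0, 1.0, 1.0, 1.0])
# Each fused value lies between the two modality values for that dimension.
```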

10:00
A Conversational Agent for Natural Language Access to Public Health Data

ABSTRACT. DATASUS, Brazil's national public health data repository, provides access to large volumes of epidemiological data. Among its systems, the Hospital Information System in Reduced Data format (SIH-RD) records millions of hospitalization procedures. Despite being one of the world's largest epidemiological repositories, SIH-RD microdata remains analytically inaccessible to non-technical practitioners: opaque clinical encodings, ambiguous join paths, and coded value mappings confound general-purpose language models, and no Portuguese natural-language interface for DATASUS currently exists. To the best of our knowledge, we present the first Text-to-SQL system for DATASUS SIH-RD, enabling queries over 18.7 million hospitalization records. To address this, we derive fifteen domain-specific SQL generation rules from systematic SIH-RD schema analysis and embed them in a 9-stage LangGraph pipeline with query routing, automatic table selection, chain-of-thought planning, SQL generation, static validation, and bounded self-repair, requiring no model fine-tuning. We also construct a benchmark of 120 Portuguese healthcare queries stratified into Easy, Medium, and Hard tiers (40 each) with gold-standard SQL over records from two Brazilian states (2008-2023), comprising the first formal Text-to-SQL evaluation on SIH-RD. The agent achieves 93.3% execution accuracy (112/120) with 100% pipeline completion; a controlled single-shot baseline sharing identical model, domain rules, and prompts achieves 90.0% (108/120), with the advantage concentrated in Hard queries (+10.0 pp), isolating the contribution of graph orchestration for complex multi-table reasoning.

10:15
Enhancing Medical Question Answering in Open LLMs via Inference-Time Ensembles

ABSTRACT. Large Language Models (LLMs) have demonstrated strong performance on medical examination benchmarks, particularly when augmented with structured prompting and ensemble-based decoding strategies. Methods such as Chain-of-Thought reasoning, dynamic example retrieval, and self-consistency suggest that meaningful gains may be achievable without parameter updates. However, the extent to which these inference-time strategies enhance clinical reasoning in medium-scale open-weight models remains insufficiently investigated. To examine this question, this work evaluates a MedPrompt pipeline using the Qwen3-8B model on the MedMCQA dataset and compares it against a strict deterministic zero-shot configuration. The approach combines correctness-filtered Chain-of-Thought, k-Nearest Neighbors retrieval of semantically similar examples, and a meta-ensemble that aggregates predictions across temperature scaling and choice shuffling. The resulting system improves accuracy from 58.6% to 65.6%, yielding a 7-percentage-point absolute gain without fine-tuning. These gains arise primarily from three mechanisms within MedPrompt: filtering Chain-of-Thought demonstrations to include only correct reasoning traces, dynamically retrieving semantically similar examples to guide the model, and aggregating predictions across multiple decoding configurations. This combination enhances reasoning guidance, contextual relevance, and prediction stability without modifying model parameters. These findings support structured inference-time prompting as a reproducible mechanism for improving medical multiple-choice reasoning in open-weight LLMs.
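The final aggregation step of such a meta-ensemble reduces, in its simplest form, to a majority vote across decoding runs; a minimal sketch with hypothetical answers:

```python
from collections import Counter

def ensemble_vote(predictions):
    """Majority vote over answers from multiple decoding runs
    (e.g. different temperatures and answer-choice orderings)."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical answers from five decoding configurations for one question.
print(ensemble_vote(["B", "B", "C", "B", "A"]))  # B
```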

09:00-10:30 Session 1B: Deep Learning Methods for Medical Imaging
Location: Atrium A
09:00
MicroClinic: An Ultra-Low-Parameter Neural Network for Medical Image Analysis

ABSTRACT. This paper introduces MicroClinic, an ultra-compact convolutional neural network designed for medical image analysis under extreme resource constraints. While state-of-the-art architectures typically rely on millions of parameters, MicroClinic operates in a regime of 0.4k to 1.1k trainable parameters, following a design philosophy where relational processing substitutes for parameter redundancy. The architecture integrates lightweight convolutional blocks with a Convolutional Multi-Head Attention (CMHA) mechanism to capture global spatial dependencies without increasing network depth. Benchmarked across twelve independent clinical datasets, MicroClinic achieves competitive performance, reaching 99.9% accuracy on MedicalMNIST and 95.7% on COVID-19 X-Ray classification, effectively matching models up to 29,000 times larger in parameter count. Beyond efficiency, the reduced scale limits the network's memorization capacity (its ability to attain zero training error), potentially improving data privacy, while also enabling structural analysis of the learned representations through metrics such as the Fisher discriminant ratio and mutual information. These results demonstrate that diagnostically meaningful accuracy can be achieved within a minimal parameter budget, enabling AI-assisted screening in underserved and latency-sensitive clinical environments or on hardware with limited computational resources.

09:15
Which Factors Influence Success of Unconstrained Interpolation for Augmentation in Multiple Instance Learning?

ABSTRACT. For classifying digital whole slide images in the absence of pixel-level annotations, multiple instance learning methods are applied. Since the number of samples is often low in this setting, data augmentation is important. Here, we investigate unconstrained (multi)linear interpolation between feature vectors, a data augmentation technique that has proven capable of improving the generalization performance of multiple instance learning models. Recent work has shown both high performance and high variability, but it remains unclear which factors influence this performance. To gain insights, we conducted a large study incorporating 9 different dataset configurations, two feature extraction approaches, stain normalization, and two multiple instance learning architectures. We identified consistent behavior when varying the feature extraction method and the classification model. However, we observed a strong dependence on the underlying image data.
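Unconstrained linear interpolation between feature vectors can be sketched as follows; the lambda range is illustrative, and because lambda may fall outside [0, 1], augmented samples can extrapolate beyond the segment between the two inputs:

```python
import random

def interpolate_features(x1, x2, lam=None, lam_range=(-0.3, 1.3)):
    """Unconstrained linear interpolation of two feature vectors: lambda
    is drawn from a range wider than [0, 1] (illustrative bounds), so
    augmented samples may extrapolate beyond the segment between x1, x2."""
    if lam is None:
        lam = random.uniform(*lam_range)
    return [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]

print(interpolate_features([0.0, 0.0], [1.0, 2.0], lam=0.5))  # [0.5, 1.0]
print(interpolate_features([0.0, 0.0], [1.0, 2.0], lam=1.5))  # [-0.5, -1.0]
```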

09:30
Multimodal Deep Learning for Tumor Site Classification: Integrating Histopathology and Gene Mutation Status

ABSTRACT. Accurate identification of a tumor site of origin is particularly consequential for cancers of unknown primary (CUP), since therapy selection depends on the inferred origin. Whole-slide histopathology images (WSIs) provide rich morphological cues but are known to suffer from acquisition-driven domain shift. Genomic alteration profiles provide complementary molecular evidence about tumor lineage and biology, though they can also vary with sample processing, coverage, and variant-calling choices. Since the two analyses reflect different aspects of tumor biology and are subject to distinct sources of variation, combining them can reduce reliance on any single, potentially biased signal. In this work, we present a multimodal primary site classifier that integrates WSI representations with a compact mutation-status profile derived from a 92-gene panel. Starting from the TOAD framework for tumor origin assessment, we modified the model to (i) focus exclusively on tumor-site classification, removing the task of discriminating metastatic and primary tumor, and (ii) incorporate a binary vector describing the mutation status of 92 genes. Experiments on matched histopathology–genomics cases from TCGA demonstrate a strong interaction between modality utility and resolution domain shift: for a test set composed of digital slides with out-of-distribution microns-per-pixel (mpp) with respect to the training set, the genomics-only model is substantially more robust than the WSI-only model (top-1 accuracy 0.51 vs. 0.27), whereas in-distribution mpp favors histopathology, and multimodal fusion yields the best performance (0.90 top-1 accuracy).

09:45
Automated Ki-67 Proliferation Index Estimation for Deep Learning Applications in Histopathology

ABSTRACT. Accurate assessment of the Ki-67 proliferation index is essential in histopathology, yet manual counting of positively stained nuclei in immunohistochemistry slides is time-consuming and subject to inter-observer variability. This paper presents an automated method for Ki-67 index estimation based on morphological image processing and evolutionary optimization. The approach integrates color-based preprocessing, morphological filtering, and distance transform–based cell separation to segment and quantify stained nuclei. A genetic algorithm is used to optimize key parameters to improve segmentation robustness across heterogeneous tissue samples. Experimental results indicate that parameter optimization enhances consistency compared to non-optimized configurations. The proposed method provides an automated alternative to manual assessment and supports label generation for deep learning applications in digital pathology.
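For reference, the Ki-67 proliferation index itself is a simple ratio once positive and negative nuclei have been counted; the counts below are toy values, not data from the paper:

```python
def ki67_index(positive_nuclei, negative_nuclei):
    """Ki-67 proliferation index: percentage of counted tumor nuclei
    that stain positive."""
    total = positive_nuclei + negative_nuclei
    return 100.0 * positive_nuclei / total

# Toy counts for one region of interest (illustrative only).
print(ki67_index(142, 358))  # 28.4
```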

10:00
Deep Learning Architectures for Automated Classification of Fetal Liver Echotexture in Gestational Diabetes

ABSTRACT. Gestational diabetes mellitus (GDM) promotes fetal hyperinsulinemia, leading to fat accumulation in the fetal liver detectable via routine B-mode ultrasound. No published work has applied deep learning to classify fetal liver echotexture for automated GDM-related metabolic assessment. We present a comparative study of six convolutional neural network architectures (ResNet-18, ResNet-34, ResNet-50, EfficientNet-B0, EfficientNet-B4, and EfficientNet-B7) for binary classification of fetal abdominal circumference above the 75th percentile (CA > p75) from liver-only ultrasound images. A patient-stratified cohort of 232 patients (110 GDM, 122 controls) provided 1,733 matched liver-only images (from a full dataset of 2,047), with class imbalance addressed via minority oversampling and weighted focal loss. Models were selected by F-beta score on validation and evaluated on a held-out test set using AUC, sensitivity, specificity, and F1 with bootstrap 95% confidence intervals. EfficientNet-B0 achieved the highest AUC of 0.618 (95% CI: 0.537-0.694). All models achieved very high true sensitivity for the elevated-CA class (0.93-1.00) but very low true specificity (0.00-0.13), indicating over-prediction of the positive class driven by the combined oversampling and weighted focal loss strategy. These results establish the first baseline for automated deep-learning screening of fetal hepatic echotexture in gestational diabetes and motivate larger multi-centre validation.
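The bootstrap confidence intervals reported above can be reproduced for any metric with a percentile bootstrap; a generic stdlib sketch using toy labels and accuracy as the metric:

```python
import random

def bootstrap_ci(y_true, y_pred, metric, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for any metric(y_true, y_pred)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]

def accuracy(t, p):
    return sum(a == b for a, b in zip(t, p)) / len(t)

# Toy labels: the point estimate is 0.8, and (lo, hi) brackets it.
y_true = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 1, 1, 0]
lo, hi = bootstrap_ci(y_true, y_pred, accuracy)
```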

10:15
A Novel Multi-Omics Driven Deep Learning Approach for Accurate Lung Cancer Detection

ABSTRACT. Lung adenocarcinoma (LUAD) remains a formidable challenge in clinical oncology, characterized by a complex molecular landscape that often renders traditional staging methods insufficient. This research introduces a sophisticated deep learning architecture engineered to integrate RNA sequencing (RNA-Seq), microRNA (miRNA) expression, and DNA methylation profiles into a single predictive framework. Our methodology emphasizes a stratified preprocessing pipeline, where each omics modality undergoes independent differential analysis before being synchronized through sample-wise alignment. The study utilizes data from The Cancer Genome Atlas (TCGA) repository to bridge the gap in computational synthesis of diverse molecular layers into a unified diagnostic signal.

09:00-10:30 Session 1C: ECG and Physiological Signal Processing
Location: Atrium B
09:00
Quality Over Quantity: The Impact of Diagnostic Certainty of Data in Deep Learning for ECG Analysis

ABSTRACT. Deep learning models for ECG-based myocardial infarction detection often prioritize discriminative accuracy over probability calibration, overlooking how the diagnostic certainty of training data impacts clinical reliability. To address this, we trained a CNN-LSTM and a Bidirectional-Mamba-2 architecture on the PTB-XL dataset using three distinct cohorts: exclusively certain cases, all inclusive cases, and a mixed cohort matched for size. When evaluated against a standardized test set of exclusively certain cases, models trained on high-certainty labels consistently yielded the most accurate calibration, achieving Expected Calibration Errors of 0.0184 and 0.0339 respectively. Conversely, substituting certain data with uncertain labels in the size-matched mixed cohort significantly increased calibration error and false positive rates. Furthermore, Subclass-aware Integrated Gradients analysis confirmed that models trained on certain data learned physiologically congruent lead importance patterns, demonstrating that label quality, rather than sheer data quantity, fundamentally drives both model trustworthiness and clinical interpretability in medical AI.
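The Expected Calibration Error used above has a standard binned formulation; a minimal sketch (equal-width confidence bins, gap weighted by bin size):

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin predictions by confidence; ECE is the bin-size-weighted mean
    absolute gap between average confidence and empirical accuracy."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        bins[min(int(conf * n_bins), n_bins - 1)].append((conf, ok))
    ece, n = 0.0, len(confidences)
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(ok for _, ok in b) / len(b)
        ece += len(b) / n * abs(avg_conf - accuracy)
    return ece

# Fully overconfident toy model: always 100% confident, right half the time.
print(expected_calibration_error([1.0] * 4, [1, 1, 0, 0]))  # 0.5
```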

09:15
Safety-Oriented Interpretable ECG Denoising with Regression Tsetlin Machines

ABSTRACT. This paper proposes a safety-oriented and interpretable hybrid framework for ECG denoising that decouples artifact intensity estimation from waveform restoration. Three Regression Tsetlin Machines, trained with frequency-isolated features and hard negative mining, estimate normalized intensities of baseline wander, muscle artifact, and power line interference. Calibrated estimates modulate deterministic wavelet-based attenuation and adaptive notch filtering, enabling adaptive yet bounded suppression without direct waveform reconstruction. Evaluated on 1000 windows from 10 unseen patients under realistic mixed-noise conditions, the framework achieves a mean SNR gain of +8.50 dB (95% CI: [+7.90, +9.10]) and satisfies IEC 60601-2-25 amplitude tolerances in four of five noise categories. Integer-only inference requires 56 ms per 4-second window (71× real-time), while clause-level transparency supports feature auditing and physiological validation, enabling predictable behavior and suitability for safety-critical and embedded deployment.
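A fixed-depth IIR notch of the kind such pipelines modulate can be written in a few lines; this is a generic second-order notch using the RBJ audio-EQ-cookbook coefficients, not the paper's calibrated adaptive implementation:

```python
import math

def notch_filter(x, fs, f0, q=30.0):
    """Second-order IIR notch (RBJ audio-EQ-cookbook coefficients):
    attenuates a narrow band around f0 Hz, e.g. 50/60 Hz mains hum."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0, b1, b2 = 1.0, -2.0 * math.cos(w0), 1.0
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    b0, b1, b2, a1, a2 = b0 / a0, b1 / a0, b2 / a0, a1 / a0, a2 / a0
    # Direct-form-I difference equation.
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        y.append(yn)
        x1, x2, y1, y2 = xn, x1, yn, y1
    return y

# A 50 Hz sinusoid sampled at 500 Hz is suppressed once the filter
# settles, while components far from 50 Hz pass essentially unchanged.
fs = 500.0
hum = [math.sin(2.0 * math.pi * 50.0 * i / fs) for i in range(2000)]
cleaned = notch_filter(hum, fs, f0=50.0)
```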

09:30
Benchmarking Signal Reconstruction Pipelines Against Direct Visual Feature Extraction for Digitized ECGs

ABSTRACT. Although deep learning models demonstrate superior performance in interpreting 1D electrocardiogram (ECG) data, a vast number of clinical records are archived as static images, limiting the deployment of state-of-the-art models. This study investigates whether to address this by reconstructing the 1D signal from the image or by applying computer vision models directly to the image. Utilizing the PTB-XL dataset and ECG-Image-Kit, we benchmark a direct visual feature extraction approach (using EfficientNet-B0 and DINOv3) against a signal reconstruction pipeline (U-Net followed by a custom 1D ResNet). The models are evaluated on biological age regression, sex classification, and pathology detection. Results indicate that the 1D ResNet model operating on reconstructed signals consistently outperforms 2D vision-based models across all tasks, despite having significantly fewer parameters. For instance, the 1D model achieved an age regression mean absolute error (MAE) of 9.09 years compared to over 14.30 years for the 2D models. The findings suggest that 1D temporal representations of ECG data are more information-dense for diagnostics, and that targeted signal processing remains a more robust framework than direct image analysis using general-purpose foundation models.

09:45
Temporal Latent Priors for Sequential VAE Modeling of Full-Length 12-Lead ECGs

ABSTRACT. Variational autoencoders (VAEs) are increasingly used for electrocardiogram (ECG) representation learning, yet most prior work focuses on short segments or single-lead recordings and rarely evaluates reconstruction, classification, and latent-space behavior jointly on full 10-second 12-lead clinical ECGs. This paper presents a modular sequential VAE framework for PTB-XL that explicitly models time-indexed latent trajectories and compares independent versus temporally structured priors across encoder architectures and objective functions. We show that temporally structured latent priors significantly improve reconstruction fidelity and latent utilization without degrading multi-label diagnostic performance. Transformer encoders combined with InfoVAE-MMD objectives provide the best balance between reconstruction and representation quality. However, unconditional generation from the learned priors remains physiologically limited. The results highlight the importance of latent dynamics for long multilead ECG modeling and provide guidance for future generative cardiac models.

10:00
Beyond Single-Beat Classification: Quantifying Arrhythmia in Long-Term ECG via Prevalence Estimation

ABSTRACT. Long-term electrocardiogram (ECG) monitoring is essential for determining the arrhythmic burden, a critical clinical metric for diagnosing cardiovascular conditions. Traditionally, this burden is estimated using a Classify-and-Count (CC) approach, which labels individual heartbeats and aggregates results by counting predictions for each label. However, even state-of-the-art Deep Learning classifiers exhibit systematic biases that accumulate over long-term recordings, leading to significant diagnostic inaccuracies. This paper investigates the application of quantification techniques to estimate arrhythmia prevalence in long-term ECG signals from the MIT-BIH Arrhythmia Database. We compare several base classifiers paired with quantification algorithms against a high-performance Deep Learning baseline, LITETime, using the standard CC method. Our results demonstrate a quantification paradox: while LITETime achieves superior beat-by-beat accuracy, simpler classifiers equipped with quantification adjustment layers, particularly the Expectation-Maximization Quantifier (EMQ), significantly reduce the Mean Absolute Error (MAE) in prevalence estimation. By correcting the systematic bias caused by Prior Probability Shifts, our framework provides a more reliable diagnostic tool for long-term monitoring and wearable cardiac devices.
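The Expectation-Maximization Quantifier referenced above iteratively re-weights a classifier's posteriors to account for prior probability shift; a compact Saerens-style sketch with toy posteriors (not MIT-BIH data):

```python
def emq(posteriors, train_priors, n_iter=100):
    """Expectation-Maximization Quantifier: estimates test-set class
    prevalence by iteratively re-weighting the classifier's posteriors
    for the shift away from the training priors."""
    k = len(train_priors)
    priors = list(train_priors)
    for _ in range(n_iter):
        # E-step: rescale each posterior by the current/train prior ratio.
        adjusted = []
        for p in posteriors:
            w = [priors[c] / train_priors[c] * p[c] for c in range(k)]
            z = sum(w)
            adjusted.append([v / z for v in w])
        # M-step: prevalence estimate = mean adjusted posterior.
        priors = [sum(a[c] for a in adjusted) / len(adjusted) for c in range(k)]
    return priors

# Toy posteriors from a classifier trained with balanced priors.
posteriors = [[0.7, 0.3]] * 6 + [[0.3, 0.7]] * 4
prevalence = emq(posteriors, train_priors=[0.5, 0.5])  # ~[0.75, 0.25]
```

For this toy set, naive classify-and-count reports a class-0 prevalence of 0.6, while the EM fixed point lands at the maximum-likelihood estimate of 0.75.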

10:15
Cuffless Continuous Arterial Blood Pressure Waveform Estimation From PPG Using a CNN–LSTM Encoder–Decoder

ABSTRACT. Blood pressure is a key parameter used to assess a patient’s condition. It is typically measured at predefined intervals after a patient is admitted to the hospital or as part of routine home monitoring. However, periodic measurements may be insufficient when a patient’s condition is critical or unstable. Continuous blood pressure monitoring is challenging and sometimes requires invasive methods, even though it is a crucial vital sign during many surgical procedures. While some cuffless blood pressure monitors can provide noninvasive, continuous measurements, they require additional equipment and charging, increasing the logistical burden for hospitals. In this study, we propose an algorithm that estimates continuous blood pressure from a photoplethysmography (PPG) signal, which can be easily acquired using a finger pulse oximeter and analyzed using machine learning methods. Our algorithm achieves mean absolute errors (MAE) of 6.44 and 2.31 for systolic and diastolic pressure, respectively.

09:00-10:30 Session 1D: Intelligent Wearables, Assistive Technologies
Location: Atrium C
09:00
Emotion-Aware Assistive System with Wearable Haptic Feedback for Visual Impairment

ABSTRACT. Visual impairment limits access to nonverbal social cues and facial expressions, negatively impacting social participation and psychological well-being. Assistive technologies offer a strategy to enable interpersonal interaction. This work presents the development and validation of an integrated biomedical assistive system that performs real-time facial emotion recognition (FER) and delivers vibrotactile feedback to visually impaired users.

The proposed platform combines a convolutional neural network enhanced with Convolutional Block Attention Modules (CBAM-4CNN) and a wearable Bluetooth Low Energy (BLE) haptic device. To improve clinical reliability, a data-centric optimization strategy was implemented on the AffectNet dataset, addressing severe class imbalance and label noise through manual inspection and automated Confident Learning. Model accuracy improved from 58.46% to 79.18%, highlighting the importance of data quality in biomedical AI applications. The FER model was deployed on a Raspberry Pi 5 for local inference, enabling real-time processing without cloud dependency. Emotion outputs are wirelessly transmitted to a custom nRF52840-based wearable module that encodes emotional states into distinct vibration patterns. Qualitative validation was conducted with visually impaired users undergoing rehabilitation, who confirmed the system's responsiveness and practical feasibility. The proposed solution demonstrates the potential of multimodal AI-driven assistive systems to support social inclusion and partial functional independence in individuals with visual impairments.

09:15
An Energy-Efficient Wearable System for AF Detection: LLM-NAS Driven Lightweight Neural Network and Embedded Deployment

ABSTRACT. Atrial fibrillation (AF) detection is pivotal for stroke prevention, yet the deployment of robust deep learning models on resource-constrained wearable devices remains a formidable challenge due to excessive computational demands. This paper presents an automated, hardware-aware design pipeline for ultra-lightweight AF classification, driven by Large Language Model-based Neural Architecture Search (LLM-NAS). By translating hardware constraints into structured linguistic priors, we leverage the reasoning capabilities of LLMs to discover an optimized convolutional neural network architecture that synergizes time-frequency dual-branch feature extraction, depthwise separable convolutions, and channel attention mechanisms. To further bridge the gap between algorithmic complexity and embedded efficiency, the discovered model undergoes a two-stage compression suite involving structured pruning and quantization-aware training. Experimental results on the CPSC2021 dataset demonstrate that the resulting model achieves a high F1-score of 0.9674 with only 7.93K parameters and a minimal memory footprint of 7.7 KB—a significant reduction compared to existing state-of-the-art models. Furthermore, we implemented a complete prototype system on an STM32F767 microcontroller, achieving a single-inference latency of 306.20 ms and an incremental power consumption of 0.18 W. This end-to-end validation confirms the feasibility of our LLM-driven methodology for real-time, high-fidelity AF monitoring in next-generation medical IoT devices.
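The parameter savings from the depthwise separable convolutions mentioned above follow from a simple count; a sketch comparing the two layer types (bias terms omitted, channel sizes illustrative, not the paper's discovered architecture):

```python
def standard_conv_params(c_in, c_out, k):
    """Standard conv: one k-tap kernel per (input, output) channel pair."""
    return c_in * c_out * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise conv (one k-tap kernel per input channel) followed by
    a 1x1 pointwise conv that mixes channels."""
    return c_in * k + c_in * c_out

# Example: 32 -> 64 channels, kernel size 7, bias terms omitted.
std = standard_conv_params(32, 64, 7)        # 14336
sep = depthwise_separable_params(32, 64, 7)  # 224 + 2048 = 2272
```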

09:30
Cervical Kinematic Recorder: a technological innovation for cephalogyric movement assessment

ABSTRACT. This paper presents the Cervical Kinematic Recorder (CKR) application, a tool designed to replicate and capture cervical movements performed in a clinical setting using a virtual reality environment. By extracting accelerometer signals, we developed a classification model capable of accurately distinguishing healthy individuals from patients with left or right cervical dysfunctions. The study also demonstrates how signal-processing features of Yaw, Pitch, and Roll can be used to characterize these patient categories, providing interpretable insights into the kinematic patterns associated with cervical impairments. This approach offers a promising avenue for objective assessment and monitoring of cervical function in clinical practice.

09:45
Deep Learning-enabled Monocular, Markerless 3D Pose Estimation of Laparoscopic Tooltips in Box-Trainer Simulators

ABSTRACT. Laparoscopic surgical training poses significant skill acquisition challenges, which can be overcome through automated skill assessments and objective feedback mechanisms. Such analysis fundamentally depends on accurate 3D pose estimation of instruments. Existing approaches rely on bulky hardware, additional markers, prior 3D models, or multi-view camera setups. In contrast, our work proposes and evaluates a deep learning-enabled modular pipeline for markerless 3D pose estimation of instruments from a single camera. In this paper we conduct a comprehensive evaluation of both the deep learning detection module and the pipeline's 3D pose estimation performance under unseen and challenging conditions representative of realistic simulator scenarios. The detection model achieves an mAP@0.5:0.95 of 99.3% on the test set, and the pipeline demonstrates mean absolute errors (MAE) of 1.53 mm, 1.44 mm, and 1.50 mm along the X, Y, and Z axes respectively. Our findings indicate that the proposed pipeline generalizes robustly to realistic simulator conditions, thereby advancing the feasibility of automated skill assessment and practical deployment in laparoscopic training environments, ultimately contributing to improved quality of surgical training and patient outcomes.

10:00
Knee Exercise Directional Control and Range of Motion Measurement Device for Physical Therapy Monitoring of Post-Total Knee Arthroplasty Patients

ABSTRACT. This paper presents the Smart Knee Rehab, a low-cost system for post-Total Knee Arthroplasty (TKA) rehabilitation that combines a fully passive mechanical guide with markerless computer vision to measure knee range of motion (ROM) during flexion and extension exercises. The passive rail constrains motion to the sagittal plane to promote safer execution at home, while the camera-based algorithm provides quantitative ROM feedback without wearable sensors. In validation against an iPhone-based inclinometer reference (Measure app on iPhone 13 Pro Max; gyroscope), the system achieved a mean absolute error of 2.38° and a root mean square error of 4.09°.
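A markerless ROM estimate of this kind typically reduces to an angle between keypoint vectors; a generic sketch with hip, knee, and ankle as 2D points (illustrative geometry, not the paper's algorithm):

```python
import math

def joint_angle(hip, knee, ankle):
    """Angle at the knee (degrees) between the thigh (knee->hip) and
    shank (knee->ankle) vectors, from 2D keypoints."""
    v1 = (hip[0] - knee[0], hip[1] - knee[1])
    v2 = (ankle[0] - knee[0], ankle[1] - knee[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

straight = joint_angle((0, 2), (0, 1), (0, 0))  # ~180 deg: full extension
bent = joint_angle((0, 2), (0, 1), (1, 1))      # ~90 deg: right-angle flexion
```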

10:15
State-of-Flight: Wearable EEG and Motion Context During Flying or Floating

ABSTRACT. We introduce State-of-Flight, a hypothesized transition state that may emerge during taxi, takeoff, cruise, and landing, where vestibular input, vibration, engine noise, and anticipatory arousal co-occur. We report an in-cabin field protocol using a Muse S/Athena-class wearable EEG headband (4 channels, 256 Hz) with synchronized inertial measurements (IMU), together with a reproducible analysis pipeline based on 0.5–30 Hz preprocessing, 5 s epoching, and spectral feature extraction. Across a small field dataset (n=5 flight participants, n=2 control participants, along with a water-based float comparison dataset), pooled epoch-level analyses show elevated theta-beta ratios (TBR) in flight phases relative to controls, with broad variability consistent with real-world wearable recordings. Frontal alpha asymmetry (FAA) also varies across conditions, but is interpreted conservatively due to motion, fit variability, and sign-convention sensitivity. These results are preliminary and primarily demonstrate feasibility, instrumentation, and a reusable analysis scaffold for higher-powered future studies in aviation, neuroergonomics, and motion-aware wearable sensing.
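The theta-beta ratio reported above is a ratio of EEG band powers. A minimal sketch of how it can be computed from a precomputed power spectral density; the band edges (theta 4–8 Hz, beta 13–30 Hz) are common conventions assumed here, not taken from the paper:

```python
def band_power(freqs, psd, lo, hi):
    # Integrate the PSD over [lo, hi) Hz (rectangle rule over frequency bins)
    df = freqs[1] - freqs[0]
    return sum(p for f, p in zip(freqs, psd) if lo <= f < hi) * df

def theta_beta_ratio(freqs, psd):
    # Conventional bands (assumed): theta 4-8 Hz, beta 13-30 Hz
    theta = band_power(freqs, psd, 4.0, 8.0)
    beta = band_power(freqs, psd, 13.0, 30.0)
    return theta / beta

# Toy PSD on a 0.5 Hz grid from 0 to 30 Hz; a flat spectrum for illustration
freqs = [0.5 * i for i in range(61)]
psd = [1.0] * len(freqs)
print(theta_beta_ratio(freqs, psd))  # ~0.235 (theta band is narrower than beta)
```

In a real pipeline the PSD would come from each 5 s epoch (e.g. via Welch's method) per channel, with the TBR then pooled across epochs as described in the abstract.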

10:30-11:00 Coffee Break
11:00-12:30 Session 3A: Explainable AI for Clinical Decision-Making
Location: Panorama
11:00
Interpretable Machine Learning for Early Sepsis Prediction

ABSTRACT. Sepsis is responsible for rising morbidity and mortality in Intensive Care Units (ICUs). Despite advances in diagnostic biomarkers and scoring systems, and as recommended by the World Health Organisation, there is a strong need for early diagnosis and timely intervention. In this study, we leverage Electronic Health Records (EHRs) from the Medical Information Mart for Intensive Care (MIMIC)-IV dataset and propose a machine learning (ML) pipeline that supports explainability. Using data from 5,285 patients and 45 clinically relevant features, we develop an ensemble of a Gradient Boosting model and a Long Short-Term Memory network to predict sepsis 12 hours in advance. The resulting Area Under the Curve (AUC) is 0.91, with sensitivity 0.91 at specificity 0.74. A Decision Curve Analysis shows strong clinical utility, with a maximum net benefit of 0.45. We use the TE2Rules explainability library for interpretability, achieving an overall fidelity score of 94% with just 44 rules for a positive prediction and 51 rules for a negative prediction. By further applying argumentation theory, ensemble sensitivity reaches 92%. This is the first study to investigate sepsis prediction in ICU patients using MIMIC-IV, achieving a significantly higher AUC at a longer lead time, and the first to apply rule-based explainable AI and argumentation theory to this problem. The pipeline shows promise for early sepsis diagnosis and intervention, ultimately reducing mortality rates and healthcare costs.

11:15
Explainable Complication Status Prediction after Tracheostomy Procedure

ABSTRACT. Tracheostomy is a common procedure in intensive care units (ICUs) and is associated with substantial morbidity, mortality, and healthcare costs. Despite the high incidence of post-tracheostomy complications, current risk assessment relies largely on clinician judgment, with no standardized predictive tools. In this study, we develop and evaluate machine learning (ML) models to predict tracheostomy-related complications by hospital discharge using early post-procedural data. We used the Medical Information Mart for Intensive Care (MIMIC)-IV database and 16 unique demographic, diagnostic, and 12-hour post-tracheostomy physiological features from 581 adult ICU patients. We evaluated Random Forest, Extreme Gradient Boosting (XGB), K-Nearest Neighbours (KNN), and Multilayer Perceptron (MLP) models across 6 temporal data representations. Interpretability was supported with SHapley Additive exPlanations (SHAP). The XGB model using combined aggregated and flattened temporal features achieved the best performance (AUC 0.79, AUPRC 0.90, Brier score 0.19). Key predictors included ventilator-associated pneumonia, infection-related diagnoses, and peak airway pressure variability. Extending the temporal window to 24 hours did not improve performance. This work represents the first ML-based approach for predicting post-tracheostomy complications, supporting clinical decision-making, improved patient experience, and lower healthcare expenditures.

11:30
Explainable Multimodal Deep Learning for Improved Diabetic Retinopathy Referral Decisions

ABSTRACT. This paper presents a multimodal deep learning (DL) model for diabetic retinopathy (DR) referral that integrates retinal fundus images with clinically relevant data selected through an explainable process. Using Shapley Additive exPlanations (SHAP) across five machine learning (ML) models, we identified urinary albumin excretion, diabetes duration, insulin use, HbA1c, and systolic blood pressure as the most informative clinical features. We integrated these variables into an InceptionV3-based convolutional neural network (CNN) through late fusion and evaluated the model on two independent datasets from Hospital de Clínicas de Porto Alegre (HCPA-2019: 2,522 images; HCPA-2021: 1,555 images). Compared with an image-only baseline, the multimodal model increased specificity from 56.7% to 64.7% in HCPA-2019 and from 72.4% to 77.5% in HCPA-2021, while maintaining sensitivity above 95% and an AUC above 0.93. These findings indicate that incorporating clinically interpretable metadata can reduce false-positive referrals and improve the clinical relevance of Artificial Intelligence (AI)-based DR screening.

11:45
A Bi-modal Knowledge Distillation Framework for Explainable Neonatal Jaundice Diagnosis

ABSTRACT. Neonatal jaundice is a prevalent condition whose delayed diagnosis can lead to severe neurological complications. Conventional diagnostic approaches such as total serum bilirubin (TSB) testing are invasive and resource intensive, while the existing non-invasive approaches often suffer from limited generalization and interpretability. To address these challenges, this study proposes a lightweight bi-modal framework (BiNJD) trained via knowledge distillation for accurate, explainable, and efficient neonatal jaundice diagnosis. The approach integrates spatial visual representation with language-based clinical semantics. A Vision Mamba visual teacher and an LLM + CLIP-based textual teacher jointly transfer complementary knowledge to a lightweight ResNet-50 student through multi-level cross-modal knowledge distillation, enabling rich feature learning without increased inference complexity. Model explainability is achieved using Grad-CAM visualizations and vision-language alignment to reinforce clinically meaningful reasoning. The proposed approach achieves state-of-the-art performance on the NJN dataset (97.37% accuracy, 98.25% F1-score) and demonstrates strong cross-dataset generalization on the externally curated JaundiSet-NG dataset, which contains darker skin tones from African populations. Computational evaluation shows significant reductions in parameters, FLOPS, and inference latency, highlighting the framework’s suitability for real-world deployment in resource-constrained clinical environments.

12:00
Explainable Finger Kinematics Decoding with Temporal Graph Neural Networks

ABSTRACT. Wearable kinematic sensing enables fine-grained monitoring of hand function, which is important for rehabilitation, assistive technology, and clinical assessment of motor disorders. The ultimate goal is to use it as a digital twin in order to identify specific motor disorders and monitor their evolution. However, decoding exoskeletal data glove recordings is challenging because the signals are high-dimensional, anatomically constrained, and informative mainly through coordinated finger joint dynamics over time. Moreover, the resulting classifiers should be interpretable at the finger joint level. We introduce (i) a Physio-Digital Temporal Graph (PDTG) that models hand anatomy and supports correlation-derived functional connectivity overlays for interpretation, and (ii) an explainable Temporal Graph Neural Network (TGNN) that performs message passing using a Graph Convolutional Network (GCN) at each time step and aggregates temporal dynamics with a bidirectional Long Short-Term Memory (bi-LSTM) for window-level classification of six hand movement tasks performed by 17 younger adults and 17 older adults using an exoskeleton data glove. Raw joint-angle trials are transformed into angle, velocity, and acceleration features and segmented into sliding windows, which we evaluate using 5-fold subject-wise cross-validation to prevent subject leakage. Our TGNN model reached a 92.5% ± 2% macro-F1 score for window-level task classification; for interpretability, we used integrated gradients to produce finger-joint-level attribution maps and compared attribution patterns between younger and older adults on fine manipulation tasks, highlighting task-dependent differences in finger joint involvement. Overall, the proposed framework provides an interpretable way to decode hand movements and enables group-level comparisons among age groups based on hand topology.

12:15
Adaptive Group-Based Counterfactual Explanations for Time-Series Rehabilitation Data

ABSTRACT. Counterfactual explanations for multivariate time-series classifiers are often difficult to interpret in domains where experts reason in terms of semantic feature groups rather than individual channels. In rehabilitation movement analysis with multi-sensor inertial measurement units (IMUs), clinicians interpret motion through muscle-group and joint-segment abstractions; yet, most existing counterfactual methods operate at the channel level, producing scattered and biomechanically incoherent explanations. We propose a two-stage framework for group-based counterfactual generation in high-dimensional IMU data. We first show that Shapley-Adaptive (SA) group ranking preserves counterfactual validity but fails to enforce group-level sparsity, motivating the need for explicit group selection. We then introduce Learnable Gate (LG) methods, which incorporate trainable per-group relevance gates jointly optimized with perturbation masks. Experiments on the KneE-PAD rehabilitation dataset demonstrate that LG substantially improves modality-group sparsity compared to the channel-level M-CELS baseline while maintaining or improving validity, temporal smoothness, and generation efficiency. Exercise-specific analyses further show that group-structured counterfactuals yield concise, muscle-level corrective guidance aligned with clinical reasoning. Overall, the proposed framework enhances interpretability without sacrificing counterfactual quality, enabling more actionable explanations for rehabilitation movement analysis.

11:00-12:30 Session 3B: Breast Imaging, Mammography, and Oncological AI
Location: Atrium A
11:00
Self-Supervised Foundation Models for Mammography: A Survey of Architectures, Benchmarks, and Clinical Translation Challenges

ABSTRACT. Two large prospective trials—MASAI (n=105,934, Sweden) and PRAIM (n=461,818, Germany)—have demonstrated that AI-assisted screening significantly improves cancer detection, reporting increases of 29% and 17.6% respectively. Despite these gains, the reduction in interval cancers remains modest (1.55 vs. 1.76 per 1,000 screens in MASAI), and aggressive subtypes such as triple-negative breast cancer remain largely undetected. While factors such as cancer growth kinetics, imaging physics, and screening intervals contribute to this residual burden, this survey argues that one plausible limiting factor is architectural: supervised convolutional systems are trained on discrete radiological labels that exclude the pre-malignant tissue signals responsible for interval cancers. Self-supervised learning (SSL) avoids this by deriving training signal from unlabelled image structure. Four classes of self-supervised learning techniques applicable to mammography are compared—contrastive learning (SimCLR/MoCo), masked autoencoders (MAE), self-distillation (DINO), and vision-language pre-training (CLIP)—along three axes: training mechanism, data requirements, and resistance to scanner-induced domain shift. We review domain-specific foundation models including MammoDINO, Mammo-CLIP, MAMA, Mammo-FM, and VersaMammo, analyse how each addresses the generalisation gap, discuss interpretability requirements for clinical deployment, and outline the research directions most likely to reduce the residual interval cancer burden.

11:15
Mammography BI-RADS Reformulation: From Raw Categories to Low/High Risk and Soft-Label Modelling

ABSTRACT. Calcification findings in mammography are small, subtle, and often lack sufficient contextual cues for confident assignment of fine-grained BI-RADS categories. As a result, models trained to predict the four discrete BI-RADS classes (2, 3, 4, 5) from calcification-only evidence tend to exhibit limited stability and modest accuracy. This work shifts attention from model architecture to label formulation by reframing the task as a clinically aligned BI-RADS risk prediction problem. Instead of predicting raw BI-RADS classes, we group them into Low (2–3) versus High (4–5) risk and compare hard labels to soft labels near the 3/4 boundary. Using the same classifier and a unified cross-database calcification dataset, the raw four-class setup reaches BACC 69.5%. The Low/High reformulation with hard labels improves to BACC 81.6% / AUC 88.1%. The soft-label variant reaches BACC 80.4% / AUC 87.2%, with a small drop in sensitivity but higher specificity and more cautious probabilities near the edge. These results show that a simple, clinically aligned risk reformulation makes analysis of calcification patterns more stable and practical without modifying the model architecture.
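The label reformulation described above can be expressed as a simple mapping from BI-RADS categories to binary or soft targets. A minimal sketch; the soft-label probabilities near the 3/4 boundary (0.2/0.8) are illustrative assumptions, since the abstract does not state the exact values used:

```python
def hard_label(birads):
    # Low risk (BI-RADS 2-3) -> 0, High risk (BI-RADS 4-5) -> 1
    if birads not in (2, 3, 4, 5):
        raise ValueError("expected BI-RADS category 2-5")
    return 0 if birads <= 3 else 1

def soft_label(birads):
    # Soften targets near the 3/4 boundary; 0.2 and 0.8 are illustrative values only
    soft = {2: 0.0, 3: 0.2, 4: 0.8, 5: 1.0}
    return soft[birads]

print([hard_label(b) for b in (2, 3, 4, 5)])  # [0, 0, 1, 1]
print([soft_label(b) for b in (2, 3, 4, 5)])  # [0.0, 0.2, 0.8, 1.0]
```

Training against the soft targets (e.g. with a cross-entropy loss that accepts probabilistic labels) is what yields the "more cautious probabilities near the edge" that the abstract reports.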

11:30
Breast Cancer Subtyping using Digital Mammograms and Feature-based Machine Learning: Temporal Subtraction vs. Single-Mammogram Analysis

ABSTRACT. Breast Cancer (BC) remains a leading cause of morbidity and mortality among women worldwide. Clinical management and prognosis depend heavily on the molecular subtype of BC, which is determined by biopsy and histopathological analysis. Despite their diagnostic value, these procedures are invasive, costly, and can delay clinical decision-making. This study investigates the added value of Temporal Subtraction (TS) compared to Single-Mammogram (SM) analysis for automatic detection and subtyping of BC using digital mammograms. A newly collected dataset of 164 temporally sequential digital mammograms, annotated by two expert radiologists, was used. Two parallel machine learning pipelines were developed. The TS pipeline incorporated image pre-processing, registration, and temporal subtraction prior to lesion segmentation, feature extraction, selection, and classification. The SM pipeline followed the same workflow, excluding image registration and temporal subtraction. Classification was evaluated for luminal A vs. non-luminal A subtypes. The TS-based approach achieved an accuracy of 91.2%, outperforming the SM-based analysis, which reached 87.6%, with the improvement being statistically significant (p < 0.05). The results demonstrate that exploiting TS provides complementary diagnostic information, enhancing subtyping performance. The proposed framework highlights the potential of TS as a fast, non-invasive decision-support tool that can reduce reliance on biopsies.

11:45
Classification of Breast Cancer Patterns in Immunohistochemistry (IHC) Images based on Multi-Class Targets

ABSTRACT. Microscopic analysis of biopsy slides, particularly immunohistochemistry (IHC), plays a critical role in cancer diagnosis. This process still largely relies on visual assessment and cell counting, which are time-consuming and subject to error and variability. Recent advances in Deep Learning, especially convolutional neural networks (CNNs), have enabled the development of interesting architectures for cell detection, counting, and classification, providing robustness and standardization for histopathological analysis. In this paper we present a comparative study of CNN architectures applied to the analysis of microscopic biopsy images in breast cancer pathology. The proposed approach considers object detection (OB) and object classification (OC) paradigms. For OB, a Faster R-CNN framework with a ResNet-50 backbone and a Feature Pyramid Network (FPN) was considered. For OC, architectures such as DenseNet-121 were evaluated due to their dense connectivity, which promotes efficient feature reuse and improved representation of fine-grained textural patterns. The experiments were conducted in a prepared environment covering dataset partitioning, data augmentation, normalization, and evaluation, using the objective metrics mean Average Precision (mAP), precision, recall, F1-score, and cell counting error. The obtained experimental results indicate ResNet as the most suitable approach for multi-class classification, with 82.89% precision, suggesting it as a promising solution for diagnosis in microscopic biopsy analysis.

12:00
Segmentation-Driven Background Skin Extraction for Robust Skin Tone Estimation in Dermatological Images

ABSTRACT. Performance disparities across skin tones remain a critical challenge in dermatological artificial intelligence, largely driven by demographic imbalance in publicly available datasets. Reliable skin tone estimation is therefore essential for analyzing and mitigating potential bias. However, accurate computation of the Individual Typology Angle (ITA), a widely used metric for skin tone characterization, depends on correctly isolating healthy background skin in lesion images. This work proposes a segmentation-driven framework for robust skin tone estimation by systematically evaluating four background skin extraction strategies: Center Crop, Structured Patches, YOLO-based lesion exclusion, and SAM-based pixel-level segmentation. Experiments conducted on the HAM10000 and PAD-UFES-20 datasets analyze how these strategies influence ITA distributions, Fitzpatrick skin type categorization, and dataset skin tone composition. Results show that background extraction significantly affects tone estimation stability and subgroup representation. While automatic methods provide consistent estimates for lighter tones, darker tones remain challenging due to dataset imbalance and intrinsic overlap in ITA values under heterogeneous acquisition conditions. To support reproducible research, we also release derived skin tone annotations and curated background skin patches for the evaluated datasets, enabling further studies on fairness and bias in dermatological AI systems.
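The Individual Typology Angle discussed above is computed from the mean CIELAB coordinates of the extracted background-skin pixels. The ITA formula itself is standard; the category thresholds below follow common ITA-based skin-tone groupings and are an assumption on my part, not values taken from this paper:

```python
import math

def ita(L_star, b_star):
    # ITA = arctan((L* - 50) / b*) in degrees, from mean CIELAB values of skin pixels
    return math.degrees(math.atan2(L_star - 50.0, b_star))

def ita_category(angle):
    # Common ITA groupings (assumed thresholds, not from the paper)
    if angle > 55:
        return "very light"
    if angle > 41:
        return "light"
    if angle > 28:
        return "intermediate"
    if angle > 10:
        return "tan"
    if angle > -30:
        return "brown"
    return "dark"

print(ita(70.0, 15.0))                # ~53.13 degrees
print(ita_category(ita(70.0, 15.0)))  # light
```

Because ITA depends directly on the L* and b* statistics of whichever pixels are treated as "background skin", the four extraction strategies compared in the abstract (Center Crop, Structured Patches, YOLO-based exclusion, SAM-based segmentation) can shift the resulting angle and hence the assigned tone category.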

12:15
MammoTwin: An Open Source Digital Twin Framework for Protocol Optimization via MRI-to-Mammography Synthesis

ABSTRACT. Mammography screening inherently involves cumulative exposure to ionizing radiation. Digital twin simulation offers a promising pathway for protocol optimization without biological risk, yet it requires rigorous physical validation to ensure clinical relevance. This work presents MammoTwin, an open-source software framework designed to generate synthetic mammograms from Magnetic Resonance Imaging (MRI), validated through standard clinical Quality Assurance (QA) metrics. The proposed system integrates a comprehensive five-phase pipeline: data acquisition, hybrid AI-driven segmentation utilizing nnU-Net and BreastSegNet, biomechanical compression modeling with exact volume preservation, physics-based X-ray simulation based on NIST XCOM cross-sections at 28 keV, and automated QA validation. Quantitative evaluation on a diverse cohort of thirteen test cases demonstrated the system's robustness across varying anatomies. The generated synthetic projections exhibited a maximum dynamic range of 64.0 dB and an effective depth ranging between 5.8 and 10.6 bits, faithfully capturing biological variability. The cohort achieved an average Signal-to-Noise Ratio (SNR) of 88.6, confirming diagnostic quality across both adipose-dominant and high-density breast tissues. Ultimately, MammoTwin is a viable tool for estimating the patient-specific radiation dose, allowing optimization of the mammography protocol to provide high-quality images at low radiation levels.

11:00-12:30 Session 3C: Cognitive Signal Analysis
Location: Atrium B
11:00
Assessing the effects of xDAWN filtering, reduction of EEG channels and addition of EOG signals on the classification of movement versus rest

ABSTRACT. Movement-Related Cortical Potentials (MRCPs) appear before movement execution/intention and have been proposed for the control of brain-computer interfaces. In this work, the detection accuracy of hand movement execution from time intervals where MRCPs are expected is investigated, using a publicly available dataset, three montages, and amplitude features with or without xDAWN filtering. The montages were: 1) the 31-channel montage used by the dataset authors (HD), 2) a montage of 6 channels covering the motor area (LD), and 3) a Hybrid setup comprising the LD montage and 4 electro-oculography channels. The classifier used was Shrinkage Linear Discriminant Analysis. The accuracy scores for the 2 x 3 factor combinations were tested using rmANOVA, with sphericity corrections where necessary, and Tukey's HSD post-hoc tests. Both the montage and its interaction with xDAWN filtering had a significant effect on the achieved accuracies (p < 0.05), with the HD and Hybrid montages outperforming the 6-channel subset. We thus conclude that hand movement execution detection benefits from denser montages and electro-oculography information.

11:15
Motor–Cognitive Performance After Cold Exposure Therapy: Wearable EEG Mini-Golf Study

ABSTRACT. Cold exposure triggers acute sympathetic activation and norepinephrine release, effects that may transiently enhance attention and motor precision after rewarming. We report a multi-session field study combining cold exposure (CE) with motor and cognitive assessments. In Session 1 (February 25), nine participants completed standardized mini-golf putting before and after a cold plunge; a subset wore Muse 2 EEG headbands (4-channel, 256 Hz) during putting before cold exposure (n = 3, 101 epochs) and after (n = 4, 110 epochs). Mini-golf scores improved in five of seven scored participants (+18.4% group mean). EEG theta/beta ratio (TBR) decreased by 12.0%, and frontal alpha asymmetry (FAA) shifted from +0.18 to +2.46, consistent with heightened alertness and approach motivation. In follow-up sessions, participants completed cognitive batteries before and after cold exposure with concurrent EEG. Across all sessions, three participants with pre/post cognitive data maintained or improved PASAT accuracy; one improved from 7/10 to 10/10 and reduced serial subtraction time by 30%. Blood pressure decreased after CE in all measured participants. These findings suggest that a single cold plunge may acutely enhance focused attention and motor–cognitive performance.

11:30
Robust and Reproducible Evaluation of Narrative-Based Sleep Disorder Classification Using Hybrid Semantic and Lexical Representations

ABSTRACT. Natural language processing enables automated classification of patient-reported sleep narratives, yet many clinical studies rely on limited validation and optimistic estimates on small datasets. This study presents a stability-aware framework for narrative-based sleep disorder classification using hybrid lexical and contextual semantic representations with a regularized linear model. A corpus of 474 labeled narratives spanning five clinically motivated categories was evaluated using repeated nested stratified cross-validation and multi-seed independent holdout testing. The model achieved a mean holdout macro-averaged F1-score of 0.90 ± 0.04, balanced accuracy of 0.91 ± 0.03, and Matthews correlation coefficient of 0.87 ± 0.05. Robustness diagnostics, including anchor-term masking, stylistic baselines, and duplicate analysis, demonstrated controlled partition sensitivity and limited reliance on lexical shortcuts. The findings highlight methodological rigor, reproducibility, and explainable modeling as essential for clinically reliable digital sleep scoring systems.

11:45
Sampling Matters: The Effect of ECG Frequency on Deep Learning-Based Atrial Fibrillation Detection

ABSTRACT. Deep learning models for atrial fibrillation (AF) detection are increasingly trained on heterogeneous electrocardiogram (ECG) datasets with varying sampling frequencies, yet the specific consequences of these discrepancies on model performance, calibration, and robustness remain insufficiently characterized. To address this, we conducted a systematic benchmark using 12-lead, 10-second recordings from the PTB-XL dataset, resampled to target frequencies of 62, 100, 250, and 500 Hz, to evaluate a standard 1-D Convolutional Neural Network (CNN) and a hybrid CNN-Long Short-Term Memory (LSTM) architecture under a rigorous patient-safe cross-validation framework. Our analysis reveals that sampling frequency significantly impacts detection metrics in an architecture-dependent manner; the hybrid CNN-LSTM model demonstrated optimal performance and consistent calibration at intermediate frequencies (100–250 Hz), whereas the 1-D CNN baseline exhibited marked degradation in accuracy and sensitivity at 500 Hz, suggesting increased susceptibility to high-frequency noise. We conclude that ECG sampling frequency is a critical, underappreciated factor in arrhythmia detection, and future foundation models must explicitly control for temporal resolution to ensure clinical reliability and reproducibility.
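The frequency sweep described above requires resampling every recording onto a common time grid before training. A minimal linear-interpolation sketch in pure Python; a production pipeline would typically use a polyphase filter (e.g. scipy.signal.resample_poly) instead, since downsampling without low-pass filtering can alias exactly the high-frequency content this study examines:

```python
def resample_linear(signal, fs_in, fs_out):
    # Resample a 1-D signal from fs_in to fs_out Hz by linear interpolation.
    # Caution: downsampling without an anti-aliasing filter; sketch only.
    duration = (len(signal) - 1) / fs_in
    n_out = int(duration * fs_out) + 1
    out = []
    for k in range(n_out):
        t = k / fs_out                        # output sample time in seconds
        i = min(int(t * fs_in), len(signal) - 2)
        frac = t * fs_in - i                  # position between input samples
        out.append(signal[i] * (1 - frac) + signal[i + 1] * frac)
    return out

x = [0.0, 1.0, 2.0, 3.0, 4.0]        # 5 samples, nominally at 500 Hz
print(resample_linear(x, 500, 250))  # approximately [0.0, 2.0, 4.0]
```

Keeping the resampler identical across all target frequencies, as a benchmark like this one must, ensures that observed performance differences reflect temporal resolution rather than resampling artifacts.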

12:00
Characterization of a Computerized Method for Non-invasive Measurement of Arterial Hyperelastic Properties: Potential for Decoding Vascular Ageing

ABSTRACT. Arterial stiffness is a core marker of vascular ageing, and pulse wave velocity (PWV) is widely used for its non-invasive assessment. However, conventional beat-level PWV is pressure-confounded and does not directly characterise within-beat hyperelastic behavior. In addition, no established automated ultrasound framework currently exists for direct incremental PWV estimation in routine point-of-care workflows. We develop a computerised multichannel RF-ultrasound method that estimates two systolic fiducial wave speeds and defines incremental PWV as ∆PWV = PWV2 − PWV1. We also construct a controlled in silico test bed to characterise sensitivity to frame rate, RF sampling rate, SNR, and channel count. Results show that frame-rate reduction via dropped-frame reconstruction causes substantial timing error at low frame rates, followed by a saturation region beyond which additional frame-rate increase provides marginal benefit. RF sampling sweeps show progressively increasing error as sampling is reduced, consistent with degraded fiducial morphology and timing localization. SNR analysis identifies a usable operating band and demonstrates that multichannel regression is markedly more robust than two-channel estimation, especially in low-SNR conditions; under selected operating conditions, incremental-PWV error remains below expected physiological within-beat PWV change. These findings indicate that reliable incremental PWV estimation is feasible when temporal resolution, sampling fidelity, and synchronization constraints are jointly satisfied and multichannel fitting is used. Clinically, this supports automated ultrasound-based incremental PWV as a scalable tool for early vascular-ageing assessment and longitudinal risk monitoring.
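The multichannel regression mentioned above fits fiducial arrival time against channel position along the vessel; the wave speed is the reciprocal of the fitted slope, and ∆PWV is the difference between the two fiducial speeds. A minimal least-squares sketch with hypothetical channel spacings and arrival times (not values from the paper):

```python
def pwv_from_arrivals(positions_m, times_s):
    # Least-squares slope of arrival time vs. position; PWV = 1 / slope (m/s)
    n = len(positions_m)
    mx = sum(positions_m) / n
    mt = sum(times_s) / n
    num = sum((x - mx) * (t - mt) for x, t in zip(positions_m, times_s))
    den = sum((x - mx) ** 2 for x in positions_m)
    slope = num / den  # seconds per metre
    return 1.0 / slope

# Hypothetical fiducial arrival times at 4 channels spaced 10 mm apart
positions = [0.00, 0.01, 0.02, 0.03]
times1 = [0.000, 0.002, 0.004, 0.006]  # first systolic fiducial -> ~5 m/s
times2 = [0.000, 0.001, 0.002, 0.003]  # second systolic fiducial -> ~10 m/s
pwv1 = pwv_from_arrivals(positions, times1)
pwv2 = pwv_from_arrivals(positions, times2)
print(pwv2 - pwv1)  # incremental PWV, ~5 m/s
```

Fitting across all channels rather than just a pair averages out per-channel timing noise, which is consistent with the abstract's finding that multichannel regression is markedly more robust than two-channel estimation at low SNR.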

12:15
Development of an automated pipeline for the digitisation of paper pain drawings

ABSTRACT. A pain manikin is a digital or paper-based diagram of a human body which can be marked with the locations where an individual experiences pain. Automated digitisation of paper manikins has the potential to save time over manual annotation, allow deeper analysis of the data, and facilitate data sharing. Current methods for digitising paper pain drawings rely on drawings being created with a red marker pen, so are unsuitable for existing datasets. Our objective was to develop and perform an initial validation of an automated open-source digitisation pipeline for paper pain manikin drawings, suitable for use on existing datasets not collected using a specific pen or drawing method.

We created an automated pipeline which aligned scanned images to a blank manikin template and isolated drawn marks from the manikin outline, generated pixel maps of the pain areas and identified which pre-defined pain regions were marked as painful. We also created a synthetic dataset to assist with reproducibility. We performed a descriptive analysis comparing the outputs of the pipeline with manual annotations from a human rater. Comparing the pipeline to a human rater on identification of pre-defined pain regions (n=44 regions per drawing) on a synthetic dataset (n=20 drawings) found that of ten regions where the pipeline disagreed with the human rater, the pipeline was correct six times. Manual inspection showed that the pixel maps were generally accurate but included non-painful areas when the drawn pain area had a convex shape.

We developed an automated pipeline for the digitisation of paper pain drawings. The pipeline may reduce human error when identifying marks in predefined regions. Further work is needed to improve the shape of pixel maps for certain shapes of pain area and to validate the pipeline.
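The region-identification step described above reduces to masking: after alignment, the drawn-pixel map is intersected with each pre-defined region mask. A minimal sketch where both are binary grids and a region is flagged painful when enough of its pixels are marked; the 5% threshold is an illustrative assumption, not a value from the paper:

```python
def marked_regions(pain_map, region_masks, threshold=0.05):
    # pain_map: 2-D 0/1 grid of drawn pixels after template alignment.
    # region_masks: {region_name: 2-D 0/1 grid}, one mask per pre-defined region.
    # A region counts as painful if > threshold of its pixels are marked (assumed rule).
    painful = []
    for name, mask in region_masks.items():
        region_px = sum(v for row in mask for v in row)
        marked_px = sum(p & m for prow, mrow in zip(pain_map, mask)
                        for p, m in zip(prow, mrow))
        if region_px and marked_px / region_px > threshold:
            painful.append(name)
    return painful

# Toy 3x3 example: marks overlap only the "left_knee" region
pain = [[0, 1, 1],
        [0, 1, 0],
        [0, 0, 0]]
masks = {"left_knee": [[0, 1, 1], [0, 1, 1], [0, 0, 0]],
         "right_knee": [[1, 0, 0], [1, 0, 0], [1, 0, 0]]}
print(marked_regions(pain, masks))  # ['left_knee']
```

A per-region fraction like this also makes disagreements with a human rater auditable, since each flagged region can be traced back to a concrete overlap count.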

11:00-12:30 Session 3D: Smart Medical Device Platforms
Location: Atrium C
11:00
Fatigue and Burnout Management in Nursing Staff Using Wearable Data

ABSTRACT. Healthcare professionals, particularly nurses, are exposed to demanding shift-based working conditions that contribute to cumulative fatigue, burnout, and increased risk of clinical errors. This paper presents a data-driven platform for fatigue-informed shift scheduling that integrates continuous high-resolution wearable-derived physiological signals, self-reported assessments, and contextual workload information. A hybrid modelling framework combines rule-based occupational health logic with machine learning–based physiological stress estimation to compute a unified Stress Index reflecting both acute strain and cumulative fatigue. The resulting index is incorporated into a scheduling engine to support workload allocation that respects predefined stress thresholds. Early-stage deployment and testing in a clinical environment indicate stable system operation, reliable data acquisition, and promising initial results regarding the feasibility of continuous stress monitoring within routine scheduling workflows.

11:15
A Modular, SOLID based Hybrid Software Architecture for Medical Devices on Heterogeneous Edge Platforms

ABSTRACT. Medical device software development faces persistent challenges in portability, maintainability, and scalability, particularly in vision-based systems where tightly coupled, hardware-specific implementations dominate. Existing architectures bind imaging pipelines directly to underlying hardware, resulting in high porting costs and resistance to modular testing. This results in significant rework during platform transitions, conflicting with the modular, independently verifiable software design advocated by standards such as IEC 62304. This work presents a modular, hybrid software architecture based on SOLID principles for medical devices, combining a layered structural decomposition with a messaging-layer-agnostic, event-driven inter-layer communication model. Five independently operating, process-isolated layers (Hardware Subsystems, Image Signal Processing, Database, GUI, and Business Logic) are initialised through a Configuration Layer and communicate exclusively through a lightweight asynchronous message bus, enforcing low coupling and enabling runtime reconfigurability without service interruption. The architecture serves as a general architectural framework applicable across medical device software systems, validated through a vision-based imaging pipeline on heterogeneous edge platforms. Runtime reconfiguration of the ISP pipeline topology is demonstrated without disruption to adjacent layers, and the GUI layer is deployed on an architecturally distinct platform with zero source code modification, validating process-level isolation and portability. The pipeline delivers 60 FPS with zero frame loss in standard operating mode across resource-differentiated hardware configurations. Cross-system reuse was validated with four of five layers requiring zero modification for a clinically distinct second system.
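The asynchronous message bus at the heart of the architecture above can be illustrated with a minimal publish/subscribe sketch. This single-process toy is only an illustration of the coupling model: the actual system is process-isolated and messaging-layer agnostic, and the topic names below are hypothetical:

```python
from collections import defaultdict

class MessageBus:
    """Minimal topic-based pub/sub bus: layers depend only on topics, never on each other."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        # A layer registers interest in a topic without knowing who publishes it
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        # A layer emits an event; every current subscriber receives it
        for handler in self._subscribers[topic]:
            handler(payload)

bus = MessageBus()
received = []
# Hypothetical layers: the ISP layer publishes frames, the GUI layer consumes them
bus.subscribe("isp/frame_ready", lambda frame: received.append(frame))
bus.publish("isp/frame_ready", {"frame_id": 1})
print(received)  # [{'frame_id': 1}]
```

Because subscription happens at runtime, handlers can be added or swapped without touching the publisher, which is the property that enables runtime reconfiguration without service interruption.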

11:30
Mixed Reality in Electronic Health Records: User Requirements and Evaluation

ABSTRACT. Electronic Health Records (EHRs) are essential to contemporary healthcare but remain limited by fragmented interfaces, poor usability, and inadequate support for complex multimodal data. Mixed Reality (MR) offers new opportunities to address these challenges, yet its integration into clinical information systems is largely unexplored. This paper presents the design and evaluation of a prototype MR-enhanced EHR that enables physicians to visualize and interact with patient data through holographic interfaces. Using MR glasses, clinicians access a three-dimensional representation of the patient’s body augmented with demographics, medical history, and diagnostic data, including 3D-rendered CT and MRI scans. Clinical information is spatially organized and accessed via gaze and gesture, supporting intuitive exploration and collaboration. A two-phase user study assessed requirements, usability, and clinical relevance. Results show strong acceptance among medical students and selective but promising interest among physicians, particularly for imaging, surgical planning, and patient communication. Overall, the findings indicate that MR-based EHRs can reduce cognitive load, improve anatomical understanding, and enhance collaborative and patient-centered care.

11:45
A Consumer EEG–Driven Word Keyboard for Assistive Communication

ABSTRACT. Individuals with motor impairments often rely on augmentative and alternative communication (AAC) devices to communicate. These systems typically depend on eye-gaze tracking or touch input, which can become unreliable as a user's condition progresses or environmental conditions vary. We present a low-cost, brain-controlled word selection keyboard, developed entirely with consumer-grade components: a Muse 2 electroencephalography (EEG) headband, a smartphone running the Mind Monitor application, and a laptop. The system streams raw four-channel EEG data over Open Sound Control (OSC) and extracts 65 spectral features for every two-second window using an FFT-based signal processing pipeline. The input is classified into one of four classes (blink, look-left, look-right, and background) using a compact neural network. A Pygame-based visual keyboard inspired by TD-Snap, a commercial AAC interface, allows users to navigate a word grid through lateral eye movements. Words are selected with deliberate blinks, which trigger both text-to-speech output and HID keyboard emulation. We describe the end-to-end system architecture and methods developed for live deployment. We also discuss the practical challenges encountered during real-time operation, mainly the sampling-rate discrepancies between offline and online data, the synchronization of user actions with the two-second windowing periods, and the inherent signal limitations of four-channel frontal/temporal EEG. The complete system is open-source and demonstrates the feasibility of using consumer-grade BCI hardware as a method of communication for users with motor impairments.

11:55
Emotion-Aware Multimodal Virtual Rehabilitation: Integrating EEG and Motor Performance for Adaptive Therapy Regulation

ABSTRACT. Rehabilitation plays a fundamental role in promoting functional independence for individuals with motor impairments. Virtual rehabilitation systems have demonstrated benefits for engagement and motor performance; however, most existing systems adjust task difficulty solely on the basis of motor performance metrics, neglecting emotional states that influence motivation and adherence. This study proposes an affective computing-based virtual rehabilitation approach that integrates electroencephalography-derived emotional metrics into real-time difficulty adjustment. The system combines hand tracking with a Leap Motion Controller and affective monitoring via the EMOTIV Insight brain-computer interface. Stress and interest indicators are continuously analyzed to dynamically adjust task difficulty by modulating an in-game agent's speed. A preliminary user experience evaluation using the User Experience Questionnaire was conducted with 8 participants, divided into control and experimental groups. The experimental group exhibited higher scores in perspicuity (1.75 vs 0.56) and stimulation (1.37 vs 0.87), suggesting that the emotion-aware version improved perceived clarity and motivational engagement compared to the task-oriented version, while maintaining stable efficiency and dependability. These findings suggest that incorporating affective metrics as primary adaptation variables enhances engagement and supports a better challenge-skill balance in virtual rehabilitation environments.

12:30-14:00 Lunch Break
15:30-16:00 Coffee Break
16:00-17:30 Session 6A: Explainable and Generative AI for Clinical Decision Support
Location: Panorama
16:00
Explainable AI in Medicine: Trends and Educational Implications

ABSTRACT. The healthcare sector is undergoing rapid transformation through the extensive use of digital technologies, and specifically Artificial Intelligence (AI). AI has the potential to drastically affect and improve the way medicine is delivered to patients. Currently, several medical AI-based applications are under experimentation, and some of them are already being used in clinical practice. To maximize the advancements that AI is bringing, AI-based tools and applications need to be trusted by physicians and patients. In this context, techniques aiming to explain decisions made by AI tools have been developed, leading to so-called Explainable Artificial Intelligence (XAI). In this paper, we conduct a bibliometric analysis of papers related to AI, XAI, and medicine in order to describe the research landscape. Based on the bibliometric analysis results, we outline the need to educate physicians and healthcare professionals more broadly about these emerging technologies and propose insights into how this can be achieved.

16:10
Faithfulness and Uncertainty Calibration of Large Language Models in Portuguese Medical Question Answering

ABSTRACT. The deployment of Large Language Models (LLMs) in healthcare is constrained by limited transparency and susceptibility to factual errors. In clinical settings, predictive accuracy must be complemented by reliable uncertainty estimates and explanations that reflect the model’s internal decision process. This paper presents an evaluation framework for Portuguese medical LLMs using the DrBodeBench dataset. We evaluate multiple scales of the Qwen model family to examine uncertainty discrimination and attributional behavior. We compare Naive Entropy, Semantic Entropy, and self-verification via $P(\text{True})$ for hallucination detection. We further employ SHAP within a perturbation-based protocol to assess explanation faithfulness. Results indicate that instruction fine-tuning improves uncertainty discrimination as measured by ROC-AUC. Under our implementation, Semantic Entropy achieves the most consistent trade-off between discrimination and calibration across model scales. Perturbation analysis reveals systematic performance degradation under the removal of highly ranked tokens, suggesting partial alignment between attribution scores and decision-relevant features in Portuguese medical question answering.

16:20
XAIqi and XAIci: Quantifying Explainability Quality and Task Complexity Across Predictive Models in Stroke Outcome Prediction

ABSTRACT. Machine learning models are increasingly used in healthcare, yet similar predictive performance across models may conceal divergent explanations, introducing explanatory uncertainty in clinical decision-making. To address this challenge, we propose two novel metrics in the context of stroke care for predicting the National Institutes of Health Stroke Scale (NIHSS) at hospital discharge. The XAI Quality Index (XAIqi) quantifies the consistency and robustness of feature importance across heterogeneous models, identifying variables that remain relevant regardless of model architecture. The XAI Complexity Index (XAIci) characterizes task complexity based on the variability of explanatory patterns between models, reflecting how consistently a prediction task can be interpreted across algorithms. Using different machine learning algorithms, we demonstrate how integrating explainability across models reduces model-specific artifacts and strengthens confidence in clinically meaningful predictors. Together, XAIqi and XAIci provide a unified framework for assessing explainability quality and task complexity in AI-driven stroke outcome prediction.

16:30
Counterfactual Reasoning to Executable Clinical Guidelines: A DMN-Based Framework for Diabetes Risk Assessment

ABSTRACT. This paper presents a hybrid decision-support framework that strengthens clinical guideline formalization by combining Decision Model and Notation (DMN) with machine-learning–based evidence and counterfactual reasoning, with the goal of improving transparency and uncertainty-aware adoption in healthcare decisions. The approach is demonstrated in the context of diabetes risk assessment. The guideline logic is formalized as DMN decision tables and operationalized as executable, auditable rule conditions. These rule conditions are then linked to data-driven evidence derived from predictive modeling to quantify outcome risk and support actionable “what-if” assessments.

To provide actionable recourse aligned with guideline semantics, counterfactual sensitivity analyses are performed under feasible interventions on modifiable patient factors. Experiments use an NHANES-derived cohort restricted to the fasting subsample, and counterfactual scenarios are generated by decreasing body-mass index (BMI) and re-evaluating DMN rule outcomes to estimate corresponding changes in diabetes risk. Predicted risk decreases consistently as BMI is reduced, with the largest improvements concentrated among individuals near DMN decision thresholds, where small changes can alter rule firing and downstream risk. Overall, the framework complements DMN-based guideline formalization with empirically grounded evidence and counterfactual insights that remain interpretable and clinically actionable.

16:40
Prompt-Based Adaptation of Vision Language Models for Clinical Pain Note Generation from Neonate Cry Sound

ABSTRACT. Accurate neonatal pain assessment remains challenging in clinical care, where documentation must be both timely and interpretable. We present a prompt-based method that adapts BLIP-2 to generate clinical pain notes from neonatal cry sounds, guided by expert-defined pain features. Cry recordings are converted to log-mel spectrograms, providing a visual representation of pain-related acoustic structure. BLIP-2 processes these spectrograms using a pretrained visual encoder and query-based cross-modal fusion, enabling image-conditioned language generation without task-specific retraining. With few-shot prompting, exemplar spectrograms paired with clinically meaningful 'pain' and 'no pain' descriptions guide the model to produce structured, human-readable notes that include an assessment outcome and salient cues such as high-frequency emphasis, intensity concentration, and temporal irregularity. Experiments show the framework produces consistent pain assessments under limited supervision, supporting AI-assisted neonatal pain documentation and decision support. The novelty of this paper is that it extends vision-language prompting into a clinical documentation setting by using neonate-cry-derived spectrograms not only for pain classification, but also for generating a structured clinical pain note.

16:50
Collaborative Intelligence in Mental Health: A Multi-Agent Framework for Personalized Treatment and Health Promotion using Next-Gen LLMs

ABSTRACT. The progressive increase in diagnoses of mental health disorders demands proactive and holistic health promotion, as well as personalized symptom treatment. Personalized and holistic health care plans must be appropriate for the individual and integrate the biopsychosocial model. Previous work demonstrates promising capabilities with large language models. However, single-agent architectures generally lack the depth of reasoning required to generate comprehensive plans that respect ethics, privacy, and safety in healthcare. This paper proposes a large language model-based multi-agent system designed to generate personalized, holistic, and evidence-based health and care plans that encompass the mental health domain. An agent-based workflow was developed using the AutoGen framework; the architecture consists of four specialized agents, and a dataset of 40 simulated clinical cases was used for evaluation. The results demonstrate the proposed system's ability to generate comprehensive, holistic clinical and lifestyle plans arising from interactions among multidisciplinary agents, showing that this type of multi-agent architecture could become a useful tool to support healthcare professionals.

17:00
RHEUMA-PRIOR-SCRIBE: Multi-Agent RAG for Rheumatology Consultation Prioritization

ABSTRACT. Rheumatology anamnesis often yields long, redundant narratives that increase clinicians’ cognitive load and hinder timely prioritization of care. To address this issue, we present a guideline-grounded decision-support prototype that combines Retrieval-Augmented Generation (RAG) with an explicitly orchestrated Multi-Agent System (MAS) to transform Spanish free-text patient narratives into structured, clinician-ready outputs. The system indexes EULAR/ACR guideline content in a vector database and retrieves case-relevant passages at runtime to ground downstream reasoning. Specialized agents then extract core anamnesis parameters, assess completeness, identify alarm features, generate a concise clinical summary and an EHR-ready note, and assign a care-priority score (0–10) with justification, without producing diagnoses or treatment recommendations. We conducted a preliminary evaluation on 21 rheumatology consultation scenarios designed by a specialist, each accompanied by the expert reference priority value and the expected outputs. The proposed pipeline achieved an MAE of 0.62 and an RMSE of 1.1 for priority assignment relative to expert scores, while retrieval relevance averaged 3.0/5. These results support the feasibility of combining guideline-based retrieval with controlled multi-agent reasoning to produce auditable, structured anamnesis outputs and assist in consultation prioritization in rheumatology.

17:10
Linguistic Speech Disfluencies: A Gender-Neutral Biomarker for Speech-Based Anxiety Detection

ABSTRACT. This study investigates whether linguistic speech disfluencies, as cognitive markers of anxiety, serve as gender-neutral biomarkers that enable equitable anxiety screening while transcending the sexual dimorphism of acoustic features. We analyzed the DAIC-WOZ clinical interview corpus to train a Random Forest classifier for detecting high-anxiety states based on Patient Health Questionnaire-8 scores (binary threshold: PHQ-8 ≥ 10). We conducted a systematic gender fairness audit for anxiety detection that jointly examines gender disparities, linguistic-feature robustness, and their interaction under controlled bias evaluation protocols. Our findings demonstrate that thoughtful feature engineering, grounded in domain knowledge about which features should theoretically be demographic-invariant, can be as effective as complex algorithmic interventions while providing greater transparency and interpretability. We urge authors of high-accuracy models to retroactively audit fairness on published results and recommend that journals require fairness evaluation sections in all submissions. We provide empirical validation of this hypothesis and demonstrate a practical pathway toward deploying fair AI systems in clinical mental healthcare.

17:20
From Behavioral Tracking to Aging Biomarkers: An Explainable Machine Learning Framework on the African Turquoise Killifish

ABSTRACT. Identifying reliable predictive biomarkers of aging is critical for understanding functional decline across biological systems. The African Turquoise Killifish (ATK), a naturally short-lived vertebrate, offers a unique opportunity to study aging dynamics over a compressed lifespan. We present a spatio-temporal machine learning (ML) framework to extract and analyze behavioral signatures of aging from longitudinal locomotor recordings of the ATK. We analyze swimming trajectories from a cohort of 92 fish aged 5–32 weeks, integrating both lateral and dorsal recording perspectives. By segmenting time series at multiple temporal scales, we quantify how short- and long-term behavioral patterns relate to age and sex. Predictive models reveal that morphological stability appears to structure the global age separation, while locomotor dynamics capture progressive transitions across aging stages. Crucially, Shapley additive explanations highlight a set of behavioral patterns that are robust across temporal resolutions, providing candidate biomarkers for aging in short-lived vertebrates. Our analysis demonstrates that spatio-temporal behavioral dynamics capture meaningful biological variation, offering insight into the progression of functional decline.

16:00-17:30 Session 6B: Medical Image Segmentation and Reconstruction
Location: Atrium A
16:00
Attention U-Net with Algebraic Refinement for Sparse-View CT Reconstruction

ABSTRACT. Sparse-view Computed Tomography is a well-known dose reduction strategy. Algebraic methods can achieve reconstructions with highly undersampled data where analytical methods such as FBP often fail, though at a higher computational cost. However, a major challenge in this approach is the generation of streak artifacts that significantly worsen the quality of reconstructed images when a very low number of projections is used. In addition, the ill-conditioned nature of the problem causes slow convergence and requires thousands of iterations, which is why complementary strategies such as regularization or filtering are needed to improve the stability of the problem. This paper proposes a hybrid reconstruction framework that uses an Attention U-Net to obtain an initial solution for an iterative algebraic reconstruction process. Specifically, the network is fed with low-quality reconstructions obtained from a few iterations of an algebraic method and generates improved images that are used as the initial solutions of the iterative method, improving both quality and numerical convergence. The data used for this study were selected from the chest studies of the DICOM-CT-PD dataset. Results show a substantial improvement in reconstruction quality with extreme undersampling (33 projections at 256x256-pixel resolution, where the minimum required by the Nyquist theorem is 400). Specifically, with the hybrid framework, SSIM increases from ≈ 0.85 (obtained with 200 iterations of the iterative method alone) to ≈ 0.96. Visually, the network suppresses the streak artifacts that previously dominated the image while preserving structural detail, and the algebraic refinement improves numerical consistency. Although soft tissue areas with lower contrast still show structural inaccuracies, this framework provides a solid basis for future advances.

16:15
Level-set-guided CNN-based Segmentation of CTPA Scans for Pulmonary Emboli Extraction

ABSTRACT. Pulmonary embolism (PE) is a severe cardiovascular condition that requires prompt and accurate diagnosis to prevent life-threatening complications. Computed tomography pulmonary angiography (CTPA) is the primary imaging modality for PE diagnosis. However, the manual analysis of CTPA scans remains challenging due to high anatomical variability, low contrast, and limited annotated data. Recent advances in deep learning (DL) have shown promise in automating CTPA image segmentation for PE boundary extraction. Challenges remain, however, associated with the small size of PE and the resulting class imbalance within CTPA scans, as well as with generalization capability, which is affected by the limited available annotations. In this work, we introduce a DL-based segmentation method that integrates a novel loss function variant, the adaptive level-set (ALS) loss, which encompasses spatial and boundary information. The ALS loss helps the model cope with the small size of PE and the associated class imbalance within CTPA images, while also acting as an inherent noise-filtering mechanism that strengthens generalization capability. The experimental evaluation, conducted on public PE datasets, demonstrates that the proposed method achieves enhanced segmentation performance and generalization capability compared to state-of-the-art methods.

16:30
Weakly supervised pleural plaque segmentation using global patient-level diagnostic cues

ABSTRACT. AI-driven approaches have been proposed for pleural plaque (PP) segmentation from computed tomography (CT) scans, aiming to produce voxel-wise binary masks. While these models show strong potential for reproducible PP segmentation, they often struggle to capture small, thin, or morphologically variable plaques. Furthermore, models trained predominantly on PP-positive cohorts, without inclusion of healthy controls, tend to exhibit local bias and limited generalization capability. This study investigates whether incorporating simple, globally accessible patient-level information (specifically, the radiologist-assessed presence or absence of PP) can enhance segmentation performance. We propose a framework that augments a pre-trained segmentation model with a lightweight deep correction module that leverages global diagnostic information to refine local PP segmentation outputs. The results demonstrated the framework’s ability to improve the reliability of traditional segmentation tools for the automated assessment of PP disease. This was achieved by leveraging globally accessible patient-level information, rather than relying on labor-intensive local delineations at the individual plaque level.

16:45
Automated Trachea Segmentation from CT Imaging Using AI Models

ABSTRACT. Accurate trachea segmentation from computed tomography (CT) is a prerequisite for image-guided airway assessment, precision tracheostomy planning, and safe endotracheal tube placement. The trachea presents distinct segmentation challenges due to its elongated morphology, small cross-sectional area, sensitivity to partial-volume effects, motion artifacts, and heterogeneous surrounding mediastinal structures. This study systematically compares two complementary paradigms: a fully automatic self-configuring framework (nnU-Net) and a prompt-conditioned foundation model (MedSAM) derived from the Segment Anything Model (SAM). Evaluation is performed under heterogeneous dataset regimes, including volumetric CT data with consistent inter-slice continuity and slice-based CT data lacking reliable volumetric structure. A hybrid inference strategy enabling automatic prompt generation is introduced. Quantitative and qualitative analyses demonstrate that dataset structure critically influences segmentation reliability, boundary stability, and deployment feasibility in precision airway workflows.

17:00
SpineContextResUNet: A Computationally Efficient Residual UNet for Spine CT Segmentation

ABSTRACT. Automated segmentation of the vertebral column in Computed Tomography (CT) scans is a prerequisite for pathological assessment and surgical planning. However, state-of-the-art methods, particularly those based on Transformers or large-scale ensembles, demand substantial GPU resources, creating a barrier for clinical adoption in resource-constrained environments or on edge devices. To address this, we introduce SpineContextResUNet, a computationally efficient 3D Residual U-Net designed for rapid spinal localization. Our architecture integrates a lightweight Context Block that employs parallel multi-dilated convolutions to capture long-range anatomical dependencies without the high latency of Recurrent Neural Networks (RNNs) or the memory overhead of Self-Attention mechanisms. Extensive validation on two public benchmarks, VerSe2020 and CTSpine1K, demonstrates that our model achieves Dice scores of 88.17% and 88.13%, respectively. To evaluate performance under strict hardware constraints, we compared our model against a bottlenecked SwinUNETR scaled to match our 1.7M-parameter footprint. While the constrained Transformer suffers severe performance degradation due to a lack of spatial inductive biases in a limited-data regime, our CNN-based approach maintains high accuracy. Crucially, while heavy baselines like TotalSegmentator fail due to memory exhaustion on commodity hardware (Intel Core i5, 8GB RAM), our model performs robust inference, making it a viable solution for point-of-care diagnostics and deployment on edge platforms like the Nvidia Jetson Orin Nano.

17:15
GAN-based Adaptive Radial Subsampling and Reconstruction for Brain MRI

ABSTRACT. Magnetic Resonance Imaging (MRI) is one of the main medical imaging modalities. Radial k-space sampling, due to its inherently dense coverage of the central k-space region, is particularly effective in mitigating motion-related artifacts through averaging effects and is consequently frequently used in MRI. However, fully sampled acquisitions can take a long time, thus necessitating undersampling. Undersampling is commonly implemented randomly and blindly, in accordance with the requirements of Compressed Sensing (CS), often resulting in blurred images and a low compression ratio. The amount of data required by CS can be reduced, and the quality of image reconstruction improved, if a-priori information about the underlying image is collected during the sequential acquisition process. In this work, we introduce Generative Adversarial Network (GAN)-based Adaptive Radial Subsampling and Reconstruction (GASSR) for brain MRI, an iterative adaptive acquisition/reconstruction technique for radial sparse sampling, capable of collecting the minimal and most informative set of radial trajectories by alternating iterative sampling, reconstruction, information evaluation, and the acquisition of new radial directions, based on the information content of the reconstructed image. Preliminary results indicate that GASSR effectively reduces data redundancy and achieves rapid convergence to high-quality images, outperforming similar SOTA models.

16:00-17:30 Session 6C: Emerging Computational Intelligence Methods in Healthcare
Location: Atrium B
16:00
Comparative Performance Evaluation of Contemporary Video Coding Standards, including DCVC-RT and ECM, on 2D and 360° Medical Video Datasets

ABSTRACT. Medical video streaming is increasingly important to digital healthcare systems and services, including remote diagnosis, immersive training, and virtual reality (VR) simulations. In these applications, both compression efficiency and delivery latency must be carefully evaluated, as they may affect diagnostic confidence and the fidelity of medical education. This paper presents a comparative performance evaluation, based on objective video quality assessment metrics of encoding efficiency and speed, of representative conventional, enhancement-based, and neural video codecs across two heterogeneous medical video datasets: a low-resolution 2D ultrasound dataset (560x448, 40 fps) and a high-resolution 360° emergency simulation dataset (8K-class, 7860x3840, 30 fps). The study includes seven video encoders, namely DCVC-RT, ECM, VVC, SVT-AV1, x265, hevc_nvenc, and lcevc_hevc. Objective video quality is assessed using PSNR, SSIM, and VMAF, while encoding efficiency is assessed using BD-Rate analysis. The findings show that ECM, tested on the low-resolution ultrasound videos, outperforms all other encoders, supporting its potential as a foundation for next-generation H.267 standardization. In addition, the neural video codec DCVC-RT demonstrates particularly strong encoding performance on the high-resolution 360° dataset, where ECM and VVC were not evaluated. Leveraging GPU acceleration, DCVC-RT achieves both high compression efficiency and fast encoding, making it a suitable solution for real-time encoding. Furthermore, the GPU-accelerated hevc_nvenc implementation significantly improves time efficiency compared with conventional encoders, highlighting practical trade-offs between compression gains and encoding time. Overall, the results confirm the practical relevance and suitability of neural video compression and next-generation coding technologies for real-time medical video streaming applications.

16:15
Explainable temporal analysis for high-risk carotid plaques using ArgEML – a pilot study

ABSTRACT. Advancing recent research in Explainable AI for carotid plaque risk assessment, this pilot study applies the Argumentation-based Explainable Machine Learning (ArgEML) framework to investigate the temporal prediction of ischemic events in symptomatic patients. While prior research identified high-risk asymptomatic plaques, this work shifts focus to the clinical necessity of distinguishing between ‘early stroke’ and ‘late stroke’ outcomes to optimize surgical intervention windows. By integrating sub-symbolic machine learning with symbolic logical argumentation, the study extracts decision rules from ultrasonic plaque features and generates a transparent argumentation theory. By utilizing the Explanation Space, the model identifies clinical dilemmas where evidence for stroke timing is ambiguous, providing transparent, human-like justifications. This approach aims to help clinicians identify patient profiles requiring further diagnostic analysis, ultimately fostering the trust and accountability necessary for AI adoption in high-stakes healthcare. The ArgEML learned theory demonstrated predictive reliability comparable with a statistical machine learning model. Future work will focus on validating this approach with clinicians using larger cohorts.

16:30
Controlled Factorial Analysis of Architecture–Loss Interactions in Cardiac MRI Segmentation

ABSTRACT. Cardiac MRI segmentation is essential for quantifying cardiac structure and function. The ACDC dataset provides a standardized benchmark for this task. While CNNs like U-Net are widely used, transformer-based models such as TransUNet have emerged as alternatives. However, controlled comparisons under identical conditions remain lacking, making it unclear whether performance gains arise from architectural complexity or loss function design. This study conducts a controlled 2×6 factorial experiment comparing U-Net and TransUNet across six loss configurations on the ACDC benchmark, with five random seeds per condition and identical preprocessing, augmentation, and optimization protocols. Region-aware losses performed significantly better than CE in both architectures, with mean Dice improvements of 0.5-1.5 percentage points. Under CE loss, U-Net achieves slightly higher Dice scores than TransUNet; however, this gap narrows with advanced losses and becomes non-significant after correction (e.g., Region loss raw p = 0.071, adjusted p = 0.428). Two-way ANOVA reveals that loss function explains a larger proportion of variance than architecture on both validation (η2 = 0.908 vs. 0.512) and test (η2 = 0.427 vs. 0.277) sets, with a significant interaction (p = 0.019) indicating that the optimal loss depends on architecture choice. Notably, a three-fold increase in model parameters (U-Net: 32.5M vs. TransUNet: 105.3M) yields gains comparable to those achievable through loss function design, suggesting that architectural complexity alone does not guarantee improved performance on small-to-moderate datasets. However, Hausdorff-based losses increase training time by approximately 3-10 times without proportional performance gains, highlighting an efficiency-accuracy trade-off. These findings suggest that loss function design warrants attention comparable to architectural innovation for cardiac MRI segmentation on small-to-moderate datasets.

16:45
C3MG: Clinically-Controlled Cardiac Mesh Generator based on Rectified Flow Matching

ABSTRACT. One of the biggest challenges in training robust deep learning models in medical imaging is the acquisition of high-quality 3D cardiac segmentation labels. Full volumetric annotations are costly, time-consuming, and often limited across cardiac pathologies, hindering the development of generalizable solutions. To address this issue, we propose a novel text-conditioned framework for generating realistic 3D multi-label cardiac segmentation masks directly from clinical descriptions. Our approach adapts CogVideoX, a state-of-the-art text-to-video diffusion model with an expert transformer, to operate in a volumetric segmentation domain, enabling the synthesis of anatomically coherent 3D masks from natural language prompts such as “dilated left ventricle” or “right ventricular hypertrophy.” By treating 3D segmentation volumes as pseudo video sequences, the model learns to translate semantic indicators of pathology into plausible geometric representations. This work opens new perspectives in the generation of text-based, anatomy-aware cardiac data and establishes the feasibility of bridging clinical language and 3D morphology.
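The "pseudo video" framing can be illustrated as a simple tensor transformation: each slice of a labeled volume becomes a frame, and each anatomical label becomes a channel. The slicing axis and shapes below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def mask_to_pseudo_video(volume, num_labels):
    """Convert an integer-labeled 3D mask (D, H, W) into a one-hot
    frame sequence (D, num_labels, H, W), so a video model can consume
    the depth axis as if it were time.
    """
    d, h, w = volume.shape
    frames = np.zeros((d, num_labels, h, w), dtype=np.float32)
    for label in range(num_labels):
        frames[:, label] = (volume == label)  # one channel per structure
    return frames
```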

17:00
A biologically-inspired computer vision pipeline for neonatal oral motor assessment

ABSTRACT. The automatic recognition of neonatal oral motor patterns may provide useful information about newborns' health and neurodevelopmental status. However, the distinctive morphology of newborn faces, together with their high movement variability, poses significant challenges for computer vision analysis. In this study, we propose a biologically inspired computer vision pipeline for the automatic classification of neonatal mouth poses from video recordings. The proposed methodology includes structured preprocessing, representation learning, and supervised classification. First, the video frames were processed to localise and normalise the mouth region by employing (i) face and landmark detection, (ii) rotation alignment, and (iii) photometric standardisation. Then, a Gabor+Sobel filter-based convolutional autoencoder was trained to extract compact latent representations. Each frame was encoded into a 512-dimensional feature vector, which was then used to train a multilayer perceptron classifier to discriminate between mouth closed, mouth open, and tongue protrusion. The autoencoder achieved a mean Structural Similarity Index Measure (SSIM) of 0.97 on unseen data, indicating effective preservation of key structural facial information. The performance of the multilayer perceptron model was evaluated using a leave-one-patient-out cross-validation approach on a dataset comprising 9912 frames from 8 newborns, including 6 preterm and 2 full-term infants. The global balanced accuracy was 0.83, with a comparable macro F1 score. These findings suggest that biologically inspired feature-extraction methods can capture pertinent morphological characteristics of neonatal faces and thereby provide informative representations for automated behavioural analysis.
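Leave-one-patient-out cross-validation can be sketched without any ML library: frames are grouped by patient ID and each patient in turn forms the held-out set. Patient IDs and frame counts below are illustrative:

```python
import numpy as np

def leave_one_patient_out(patient_ids):
    """Yield (train_idx, test_idx) pairs, holding out one patient per fold.

    patient_ids: array-like of length n_frames giving the patient each
    frame belongs to.
    """
    patient_ids = np.asarray(patient_ids)
    for patient in np.unique(patient_ids):
        test_mask = patient_ids == patient
        yield np.flatnonzero(~test_mask), np.flatnonzero(test_mask)
```

Every frame of a held-out newborn is thus unseen during training, which avoids identity leakage between splits, the key reason this scheme is preferred over random frame-level splits.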

16:00-17:30 Session 6D: Intelligent Sensing
Location: Atrium C
16:00
XRaudinary: Spatially Anchored Live Captions in XR via Low-Cost Microphone Arrays and Vision Fusion

ABSTRACT. People with hearing loss often face high cognitive load and reduced participation in group conversations, especially in noisy or reverberant environments. We present XRaudinary, an in-progress XR system that spatially anchors live captions to the direction of sound within the user's field of view by combining a wearable, low-cost microphone array with the user's real-time vision. An ESP32 microcontroller unit (MCU) ingests synchronized I2S MEMS microphone streams and forwards them to a server that estimates time differences of arrival (TDOA) for direction-of-arrival (DOA) inference, then forwards post-processed directionality and captions to a VR/AR headset. Constraining the sound source to the planar axis of the user's vision resolves geometric ambiguities inherent to small arrays, enabling real-time captions that appear at the correct location in the user's field of view. We describe the system architecture, the sound source localization approach, and its appropriateness for real-time conversational contexts.
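The TDOA-to-DOA step for one microphone pair can be sketched with a plain cross-correlation; a production system would likely use GCC-PHAT and a calibrated array geometry, so the microphone spacing and sampling rate below are placeholders:

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s at room temperature

def estimate_tdoa(sig_a, sig_b, fs):
    """Time delay of sig_a relative to sig_b via the cross-correlation
    peak (positive result means sig_a lags sig_b)."""
    corr = np.correlate(sig_a, sig_b, mode="full")
    lag = np.argmax(corr) - (len(sig_b) - 1)
    return lag / fs

def doa_from_tdoa(tdoa, mic_spacing):
    """Far-field direction of arrival (degrees) for a two-mic pair;
    clipping guards against delays longer than the array allows."""
    s = np.clip(SPEED_OF_SOUND * tdoa / mic_spacing, -1.0, 1.0)
    return np.degrees(np.arcsin(s))
```

The arcsin model is ambiguous front-to-back for a single pair, which is exactly the kind of geometric ambiguity the abstract's vision-plane constraint is meant to resolve.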

16:10
Geometry-Aware Event-Based TinyML for Interpretable On-Device Rehabilitation Metrics

ABSTRACT. Wearable rehabilitation systems increasingly rely on inertial sensors and machine learning (ML) to monitor posture and movement quality. However, most existing solutions execute inference continuously or apply independent scalar thresholds, limiting battery life and failing to capture the coupled dynamics of clinically valid motion patterns. We propose a geometry-aware, event-driven TinyML framework in which biomechanical deviations are modeled as compact regions in joint-angle and angular-velocity space. Instead of continuous inference, computation is triggered only when the kinematic state exits this admissible region. A prototype implementation on an STM32L4 microcontroller using TensorFlow Lite Micro demonstrates a 72.7% reduction in inference calls and a 37.3% reduction in average current consumption. The system preserves interpretable rehabilitation metrics while operating fully on-device with minimal computational overhead, supporting long-term autonomous wearable operation.
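The event-driven gating idea can be sketched as an admissible-region check in (joint-angle, angular-velocity) space; the box bounds below are illustrative placeholders, not clinically derived values, and a real deployment could use any compact region:

```python
def should_infer(angle_deg, velocity_dps,
                 angle_bounds=(-10.0, 45.0), velocity_bounds=(-60.0, 60.0)):
    """Trigger the TinyML model only when the kinematic state leaves
    the admissible region; otherwise the MCU stays in a low-power path."""
    lo_a, hi_a = angle_bounds
    lo_v, hi_v = velocity_bounds
    return not (lo_a <= angle_deg <= hi_a and lo_v <= velocity_dps <= hi_v)

def count_inference_calls(samples):
    """samples: iterable of (angle_deg, angular_velocity_dps) pairs."""
    return sum(should_infer(a, v) for a, v in samples)
```

Because the trigger couples angle and velocity in one region test, a fast but small motion and a slow but large excursion are both caught, unlike independent scalar thresholds.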

16:20
Motion Compensation for Ultrasound Scanning in Robotically Assisted Prostate Biopsy Procedures

ABSTRACT. Prostate cancer is one of the most common types of cancer in men. Diagnosis via biopsy requires a high level of surgical expertise and precision, making the results highly operator-dependent. The aim of this work is to develop a robotic system for assisted ultrasound (US) examination of the prostate, a pre-biopsy step that reduces dexterity requirements and enables faster, more accurate, and more accessible prostate biopsy. We developed and validated a laboratory setup with two robots: one that autonomously scans a prostate phantom and another that carries the phantom to simulate patient movement. The scanning robot maintains the relative position of the US probe and the prostate phantom, ensuring a consistent and robust approach to reconstructing the scanned prostate. To reconstruct the prostate, each slice is segmented to generate a series of prostate contours, which are converted into a 3D point cloud used for biopsy planning. The average scan time of the prostate was 30 s, and the average 3D reconstruction time was 3 s. We evaluated three motion scenarios and registered the resulting reconstructions against the stationary case. ICP registration with a threshold of 1.2 mm yielded a mean fitness of over 90% for each motion type. Due to the elastic and soft material properties of the prostate phantom, the maximum robot tracking error was 3 mm, which is considered sufficient for prostate biopsy according to medical literature.
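The reported fitness metric can be read as the inlier fraction used by standard ICP implementations (e.g. Open3D's): the share of source points whose nearest target point lies within the correspondence threshold. A brute-force NumPy sketch, illustrative only:

```python
import numpy as np

def icp_fitness(source, target, threshold=1.2):
    """Fraction of source points (N, 3) whose nearest neighbour in
    target (M, 3) is closer than `threshold` (same units as the
    clouds, mm here)."""
    # Full pairwise distance matrix; fine for small point clouds.
    diffs = source[:, None, :] - target[None, :, :]
    nearest = np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)
    return float((nearest < threshold).mean())
```

A fitness above 90% at a 1.2 mm threshold therefore means over 90% of reconstructed points land within 1.2 mm of the stationary reference after alignment.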

16:30
Contactless Blood Pressure Estimation via Smart Mirror using Pulse Transit Time

ABSTRACT. As population aging accelerates, modern technologies for health monitoring and ambient assisted living (AAL) are becoming increasingly integrated into everyday life. However, many individuals, particularly older adults, experience difficulties and discomfort in using such technologies. For this reason, contactless techniques for measuring and monitoring vital signs that prioritize user-friendliness and unobtrusiveness are being actively developed. In this context, a smart biomedical mirror for the remote monitoring of vital signs was developed as a potential solution. Building on this work, we implemented a remote blood pressure estimation algorithm using the system’s existing remote photoplethysmography (rPPG) pipeline. The preliminary evaluation demonstrated the feasibility of integrating remote blood pressure estimation within the smart mirror platform. The obtained results provide an initial proof-of-concept and highlight important considerations for improving camera-based blood pressure estimation in future work.
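A common entry point for pulse-transit-time (PTT) blood pressure estimation is a per-subject calibration mapping PTT to systolic pressure. The inverse-linear model below is one simple choice from the literature, not necessarily the model used in this work, and all numbers are illustrative:

```python
import numpy as np

def fit_ptt_to_sbp(ptt_s, sbp_mmhg):
    """Fit SBP ~ a / PTT + b by least squares on calibration pairs;
    returns the coefficients (a, b)."""
    a, b = np.polyfit(1.0 / np.asarray(ptt_s), np.asarray(sbp_mmhg), deg=1)
    return a, b

def predict_sbp(ptt_s, a, b):
    """Estimate systolic pressure (mmHg) from a new PTT sample (s)."""
    return a / ptt_s + b
```

Such models drift with posture and vascular tone, which is one of the "important considerations" a camera-based pipeline must handle through periodic recalibration.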

16:40
BlinkIQ: A Real-Time Landmark-Based Framework for Robust Eye Blink Analytics and Digital Biomarker Extraction

ABSTRACT. Eye blink rate is a promising non-invasive biomarker linked to attention, fatigue, and neurocognitive functioning. This paper presents BlinkIQ, a real-time framework for eye blink detection and digital biomarker extraction from RGB video using MediaPipe facial landmarks. The system estimates eye openness through a normalized geometric ratio and applies temporal smoothing and rule-based validation to detect blink events robustly while reducing false positives. In addition to blink detection, BlinkIQ extracts clinically relevant features such as blink count, blink rate, blink duration, and inter-blink interval variability. Its modular and low-cost design enables deployment with standard cameras in both real-time and offline settings. BlinkIQ offers an efficient and interpretable approach for continuous blink monitoring, with potential applications in fatigue assessment, cognitive monitoring, and digital health.
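The "normalized geometric ratio" is in the spirit of the classic eye aspect ratio (EAR) of Soukupová and Čech; the landmark ordering, threshold, and minimum-duration rule below are illustrative placeholders, not BlinkIQ's actual configuration:

```python
import numpy as np

def eye_aspect_ratio(eye):
    """eye: (6, 2) landmarks ordered corner, two upper, corner, two lower.
    EAR = (|p2-p6| + |p3-p5|) / (2 |p1-p4|); ~0.3 when open, near 0 closed."""
    p1, p2, p3, p4, p5, p6 = eye
    vertical = np.linalg.norm(p2 - p6) + np.linalg.norm(p3 - p5)
    return vertical / (2.0 * np.linalg.norm(p1 - p4))

def count_blinks(ear_series, threshold=0.21, min_frames=2):
    """A blink = EAR below threshold for at least min_frames consecutive
    frames; the duration rule rejects single-frame landmark flicker."""
    blinks, run = 0, 0
    for ear in ear_series:
        if ear < threshold:
            run += 1
        else:
            if run >= min_frames:
                blinks += 1
            run = 0
    if run >= min_frames:
        blinks += 1
    return blinks
```

Because EAR is a ratio of landmark distances, it is invariant to face scale, which is what makes the measure usable across camera distances without recalibration.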

16:50
A Highly Customizable Platform Enabling Sophisticated Medical Eye-Tracking Studies

ABSTRACT. Eye-tracking has demonstrated significant potential for analyzing visual attention in medical research. Nevertheless, technical challenges persist in establishing robust and customizable study designs ensuring reproducible data collection in complex scenarios. In prior work, we proposed a vendor-agnostic eye-tracking platform for conducting studies in digital (neuro)pathology. However, the initial platform was validated in an expert-supervised setting and lacked systematic evaluation of interaction design, user onboarding, and data quality mechanisms. Additionally, it exhibited a high degree of specificity, being closely associated with a particular medical domain. To address these limitations, a comprehensive set of cross-domain requirements was elicited and incorporated into a requirement-driven redesign, resulting in a refined, modular eye-tracking platform that was subsequently implemented and empirically validated. The redesigned platform contributes configurable multi-class task handling, flexible study flow components, and integrated ground truth–based feedback mechanisms. The platform was evaluated with 14 dermatology physicians performing multi-class wound classification under eye-tracking conditions. Quantitative fixation analyses with respect to labeled regions of interest were combined with standardized usability assessment using the System Usability Scale (SUS). Findings indicate the system's dependable performance in a clinical setting and its exceptional usability, as evidenced by an average SUS score of 83.75. Overall, the refined platform enables standardized, reproducible, and domain-agnostic eye-tracking studies while significantly lowering technical barriers for study execution in clinical research environments.
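For reference, the SUS score is computed from ten 1-5 Likert responses with the standard scheme: odd (positively worded) items contribute response − 1, even items contribute 5 − response, and the sum is scaled by 2.5 onto a 0-100 range:

```python
def sus_score(responses):
    """responses: list of ten 1-5 ratings in questionnaire order."""
    assert len(responses) == 10, "SUS has exactly ten items"
    total = sum(
        (r - 1) if i % 2 == 0 else (5 - r)  # items 1,3,5,... sit at index 0,2,4,...
        for i, r in enumerate(responses)
    )
    return total * 2.5
```

A score of 83.75 sits well above the commonly cited average of 68, consistent with the abstract's usability claim.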