ICCS 2026: 26TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE
PROGRAM FOR WEDNESDAY, JULY 1ST

09:00-09:50 Session 12: Keynote Lecture 5
09:00
Balancing Energy, Resource Utilization, and Performance in HPC System Operations

ABSTRACT. Operating modern high-performance computing (HPC) systems efficiently requires balancing several competing objectives: minimizing energy consumption, maximizing resource utilization, maintaining high system throughput, and ensuring acceptable response times. These optimization goals often conflict, making it necessary to adapt system operation according to site-specific priorities and workload characteristics. This talk presents an integrated software approach that combines comprehensive monitoring, AI-driven workload analytics, and dynamic scheduling to improve overall system efficiency. The SEANERGYS software aims to support production HPC environments while enabling more efficient use of available energy and compute resources. A monitoring infrastructure collects data from hardware and software sensors and correlates them with scheduler information to identify inefficiencies such as underutilized resources. Machine-learning models analyze historical and real-time operational data to characterize workload behavior, predict job resource demands, and identify complementary workloads suitable for co-scheduling. These insights inform dynamic resource management and scheduling policies that adapt system operation to improve energy efficiency and utilization while maintaining performance targets.

09:50-10:20 Coffee Break
10:20-12:00 Session 13A: CompPet 1
10:20
RACCOONS: Data Management Library for High-Performance Astrometric Reduction Pipelines

ABSTRACT. An astrometric reduction pipeline is a sequence of steps for calculating spatial positions and velocities of stars from their angular positions at different times on the celestial sphere as measured by telescopes. The accuracy of the results depends on the number of observations per star, because even the finest telescopes are affected by various sources of noise. To reach scientifically valuable accuracy, the total number of observations undergoing reduction runs into the billions, demanding highly efficient Big Data management tools to support calculations. In this work, we introduce the RACCOONS library, which provides high-performance lock-free methods for handling immutable and mutable data. It is written in C++, has a Python binding, and implements efficient mechanisms for querying the observational data of a certain astrometric mission, calculating statistical properties of requested datasets, and acting as a large in-memory storage for mutable data by joining together the RAM of several cluster nodes using remote direct memory access. This library is currently reused in three software systems within the astrometric reduction pipeline for the Japan Astrometry Satellite Mission for Infrared Exploration (JASMINE): direct and iterative astrometric solvers, and a solution visual analytics system. At the current research and development step, we tested RACCOONS and the corresponding software on relatively small CPU clusters with 0.2 PFLOPS capacity and demonstrated good scalability of both calculations and data treatment. The next step for future work is to deploy RACCOONS-powered software on a 2 PFLOPS cluster.

10:40
Bridging Grid and HPC Computing for Data-Intensive Science: Scaling dCache for Modern Workflows

ABSTRACT. The increasing integration of high-performance computing (HPC) resources into data-intensive scientific workflows places new demands on storage systems traditionally developed for grid and distributed computing environments. dCache, a mature, exascale storage system jointly developed by Deutsches Elektronen-Synchrotron (DESY), Fermi National Accelerator Laboratory, and the Nordic e-Infrastructure Collaboration (NeIC), has evolved to support a broad range of scientific communities beyond its origins in High-Energy Physics, including astrophysics, Photon science, and AI training. These communities increasingly rely on HPC systems for large-scale data analysis, exposing scalability and performance challenges at the metadata and data access layers.

This paper presents recent development efforts aimed at making dCache more HPC-friendly in interdisciplinary scientific environments. By aligning dCache’s architecture and development practices more closely with HPC requirements, we enable transparent access to shared data infrastructures from HPC clusters while preserving dCache’s strengths in data management, federation, and long-term preservation. We focus on optimizing metadata access to improve scalability and reduce latency under highly parallel workloads, as well as on substantial enhancements to dCache’s NFSv4.1/pNFS implementation. These include improved pNFS layout handling, read delegation, and zero-copy data paths to reduce CPU overhead and memory copies, resulting in significantly improved I/O performance for HPC applications. We evaluate these enhancements using representative HPC access patterns and their impact on throughput, latency, and system scalability. The benchmark results demonstrate that the proposed changes can improve current workflows by an order of magnitude, depending on the workflow in use.

11:00
Fast data processing for X-ray crystallography

ABSTRACT. For many types of experiment at large-scale X-ray facilities, data processing has traditionally been seen as a compute-intensive and time-consuming process. For one experimental method, serial X-ray crystallography, we have implemented a system for fully real-time processing of data streamed directly from the detector, with no intermediate disk storage.

Based on the CrystFEL software for serial crystallography [1] in combination with the ASAP::O high-performance data framework developed at DESY [2], the system is capable of processing more than 1000 frames per second using only a single compute node. It has been deployed at the synchrotron light source PETRA III, where it has been reliably used in a series of user experiments [3], and similar systems based on the same building blocks are now being tested at other experimental stations.

To make real-time data processing a reality, we needed to increase the speed of the algorithms. This was achieved largely by finding and removing the largest bottlenecks in the software, combined with avoiding configuration errors which unnecessarily reduced performance. These straightforward measures already led to a speed-up of around 50 times compared to the situation at the beginning of the project. Further speedups have since been made by measures such as addressing lock contention and reducing memory allocations.

A prerequisite for real-time processing is that the measurement system is properly calibrated before the experiment. For X-ray crystallography, the calibration information includes a model of the position and orientation of the detector in space, relative to the interaction point between the X-ray beam and the sample. The methods so far employed for determining this geometry have been too slow to run very frequently. However, an algorithm in particle physics exists to solve an analogous problem with much higher performance [4]. This algorithm, named Millepede, has been successfully transferred from particle physics to X-ray crystallography [5]. Even when fitting across tens or hundreds of thousands of diffraction patterns, the Millepede method calculates an updated detector model in a negligible amount of time, allowing frequent calibration updates to be a part of our high-speed data processing system.

Real-time data processing offers many advantages. First, there is the obvious improvement in “situational awareness” during the experiment: the ability to spot problems, make improvements and know when enough data has been collected. In addition, since there is no technical need to store the detector readout data on disk, there is potential for drastic reductions in the high data storage costs which are currently associated with serial crystallography experiments. As well as describing the technical achievements, this contribution will discuss the greater implications of real-time data processing on how we perform experiments at large-scale X-ray facilities.

[1] T. A. White, R. A. Kirian, A. V. Martin, A. Aquila et al. J. Appl. Cryst. 45 (2012) p335.
[2] https://asapo.pages.desy.de/asapo/
[3] T. White, T. Schoof, S. Yakubov, A. Tolstikova, et al. IUCrJ 12 (2025) p97. doi:10.1107/S2052252524011837
[4] V. Blobel, Nuclear Instruments and Methods in Physics Research A 566 (2006) p5. doi:10.1016/j.nima.2006.05.157
[5] T. A. White, J. Appl. Cryst. (2026), in press.

11:20
Data Management and I/O Provisioning Across Cloud-Edge Continuum for High-performance Computational Data Pipelines

ABSTRACT. The increasing adoption of cloud, edge, and Internet of Things (IoT) technologies has led to the emergence of the cloud–edge compute continuum, enabling low-latency, data-intensive applications across highly distributed infrastructures. However, designing and operating high-performance data pipelines in such environments remains challenging due to heterogeneous resources, distributed data sources, stringent security requirements, and the need for efficient data movement and I/O provisioning. This paper presents an architecture for data management and I/O provisioning that supports the execution of high-performance data pipelines across the cloud–edge continuum. The proposed approach combines federated data management, intelligent data placement, and continuum-aware resource orchestration within a unified platform. These capabilities are integrated with pipeline runtime services that enable adaptive execution and optimization based on observed data access patterns and resource availability. The proposed architecture is evaluated using a real industrial data pipeline scenario from the fiber laser cutting domain.

11:40
ACTS: Cross-experiment charged particle reconstruction toolkit

ABSTRACT. High-energy physics collider experiments analyze the products of particle collisions, so-called events, using large-scale detectors composed of specialized subsystems. At the core of most experiments is a tracking detector responsible for reconstructing the trajectories of charged particles.

At the Large Hadron Collider (LHC) at CERN, experiments produce vast amounts of raw data that must be processed and enriched for physics analyses. Charged particle reconstruction must handle hundreds of thousands of measurements per event, with millions of collisions occurring per second and thousands recorded for offline processing. At the High-Luminosity LHC, output rates of tens of gigabytes per second will further increase computational demands. Precise tracking is essential to distinguish rare collisions of interest from background interactions that can occur only micrometers apart.

This contribution presents A Common Tracking Software (ACTS), a modular and reusable toolkit for building charged particle reconstruction pipelines. Originally developed for LHC experiments, ACTS is now used across multiple experiments and has become a key component in modern track reconstruction systems. In addition to a baseline CPU implementation, ACTS provides a GPU-optimized tracking pipeline built from the same algorithmic concepts, enabling flexible hybrid CPU–GPU pipelines and significant acceleration of computationally intensive stages.

ACTS supports the full reconstruction chain, including detector description, numerical integration, pattern recognition, track finding and fitting, and primary vertex reconstruction.

We describe the overall architecture, design principles, and capabilities of the software, and highlight major recent developments for both CPU and GPU environments. Finally, we present results from integration within the ATLAS

10:20-12:00 Session 13B: CV-LNM 1
10:20
ViT-SpSH: A Hybrid Transformer-Spectral Head Architecture for Chromatic Aberration Detection and Localization

ABSTRACT. Chromatic aberration remains a common optical artefact in digital photography, manifesting as color fringing along high-contrast edges due to wavelength-dependent lens refraction. This paper proposes a hybrid vision transformer-based architecture for the accurate detection and localization of chromatic aberration in single RGB images. The method employs a pretrained ViT encoder to extract global contextual patch embeddings capturing long-range spatial dependencies. In parallel, chromatic residual maps are computed from inter-channel differences to explicitly highlight spectral misalignments. These residuals are fused early with ViT embeddings, followed by cross-attention refinement and convolutional upsampling in a dedicated spectral segmentation head, yielding a high-resolution probability map of aberration regions. Experiments on a diverse dataset of real-world photographs demonstrate that the proposed hybrid approach significantly outperforms classical convolutional baselines (classic CNN, U-Net, FCN-VGG) in classification accuracy, offering a robust tool for automated optical quality assessment.

10:40
Spatially refined transformer embeddings for accurate histopathology tissue classification

ABSTRACT. Accurate recognition of tissue types in histopathological whole-slide images is a fundamental challenge in computational pathology, particularly given the global decline in practicing histopathologists. We present a classification framework that integrates a lightweight Vision Transformer (TinyViT) for patch-level feature extraction with dimensionality reduction via Principal Component Analysis and Neighborhood Component Analysis. The resulting low-dimensional yet discriminative representations are classified using k-Nearest Neighbors and Support Vector Machines, yielding state-of-the-art performance on the DiagSet prostate cancer dataset and on BreakHis, even under extreme reductions in feature dimensionality. To further improve robustness, we introduce a spatial refinement strategy that projects patch predictions into a grid representation of the slide, enforcing spatial consistency by identifying and reclassifying low- and high-confidence regions. This two-stage process enhances predictive accuracy and improves interpretability by highlighting confident as well as uncertain tissue areas.

11:00
Mask Cross-Attention Transformer for Robust Exercise Recognition Under Execution-Speed Domain Shift

ABSTRACT. Video-based exercise recognition faces a critical challenge when deployment conditions differ from training environments: execution speed variation introduces temporal domain shifts that degrade model performance. We propose the Mask Cross-Attention Transformer, a dual-stream architecture that conditions temporal reasoning on human-centric spatial priors through cross-attention between appearance features and per-frame human masks. By decoupling semantic motion patterns from execution tempo, the model achieves robust recognition across diverse scenarios. On the public MM-Fit benchmark, the model achieves test accuracy of 95.0% and macro-F1 of 88.8% on 11 exercise classes, surpassing recent multimodal approaches while using only grayscale video with automatically generated masks. On a private dataset under execution-speed domain shift (where training uses slow executions and testing uses fast movements), the model achieves 85.5% accuracy and 81.5% macro-F1 across 17 exercise classes, outperforming temporal transformers with concatenation (76.5% accuracy, 66.5% macro-F1) and appearance-only variants (65.2% accuracy, 51.6% macro-F1).

11:20
Hyper-DINO: Efficient Hyperbolic Embeddings for Histopathological Content-Based Image Retrieval

ABSTRACT. In this paper, we propose new hyperbolic embeddings, Hyper-DINO, for use in histopathological content-based image retrieval. First, we use a DINO-pretrained ViT Small vision transformer to extract the base features. The resulting [CLS] token is dimensionally reduced using PCA or NCA techniques. These features are then mapped onto a hyperbolic space, so that features from positive pairs are closer together, while features from negative pairs are further apart. For this mapping, we use a specially designed Hyperbolic Mapper model with a Hyperbolic Contrastive Loss function. This approach improved MAP@20 to 94.61% on the Kather dataset and to 86.64%, 79.15%, 78.91%, and 75.17% on the BreakHis dataset at magnifications of 40X, 100X, 200X, and 400X, respectively. An additional advantage of the Hyper-DINO features is their compact size. The representation of a single image is a vector of 32 float16 numbers, so it takes up only 64 bytes.
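The abstract does not say which model of hyperbolic space Hyper-DINO uses. As a rough illustration only, the sketch below assumes the Poincaré ball model and shows its standard distance function, together with the storage arithmetic for a 32-element float16 vector. The function names `poincare_distance` and `pack_embedding` are hypothetical and not taken from the paper.

```python
import math
import struct


def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincare ball.

    Assumption: the embedding lives in the Poincare ball model; the abstract
    does not state which hyperbolic model is actually used.
    """
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


def pack_embedding(vec):
    """Serialize a vector as little-endian IEEE 754 half-precision floats."""
    return struct.pack(f"<{len(vec)}e", *vec)


if __name__ == "__main__":
    query = [0.0] * 32
    candidate = [0.1] * 32
    print("distance:", poincare_distance(query, candidate))
    # 32 values x 2 bytes per float16 value = 64 bytes per image.
    print("bytes per image:", len(pack_embedding(candidate)))
```

In a retrieval setting of this kind, candidates would be ranked by increasing distance to the query embedding; the ranking step is omitted here.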

11:40
View-Independent 3D Gait Recognition Using Sequence-Based Siamese Networks

ABSTRACT. Gait Recognition is a rapidly evolving area of computer vision research. In recent years, particular attention has been devoted to approaches based on multi-camera datasets and the three-dimensional data derived from them. However, a notable research gap persists between person identification results obtained from precise motion capture data and those achieved using marker-less approaches. The Synchronized GPJ-ATK dataset employed in this study enables addressing this issue by validating methods that utilize approximated joint positions obtained through linear triangulation against ground-truth motion capture data. This work presents a sequence-based Siamese method for view-independent 3D gait recognition tailored to this type of dataset. The proposed approach achieves a Rank-1 classification accuracy of 90.49% on triangulated data and 94.26% when evaluated using ground-truth motion capture data.

12:00
Mind the Gap: Quantifying the Domain Gap in Cross-Sensor Diffusion Super-Resolution

ABSTRACT. Demand for high-resolution satellite imagery has increased interest in super-resolution (SR) to bridge the spatial resolution gap between freely available missions such as Sentinel-2 and commercial systems like PlanetScope. Because no sensor provides true paired low- and high-resolution observations, SR models are usually trained on synthetically degraded data, creating a domain gap on real cross-sensor imagery. In this work, we provide the first systematic study of how this synthetic-to-real mismatch affects the performance of modern diffusion-based SR models. Using a large, geometrically and temporally aligned dataset of Sentinel-2 and PlanetScope imagery, we evaluate five state-of-the-art diffusion architectures under controlled experimental settings. We also introduce LPIPS_Sat, a domain-adapted perceptual metric based on Sentinel-2 self-supervised features. Our results show two persistent challenges: synthetically trained models degrade sharply on real pairs, while models trained on real cross-sensor data exhibit optimisation difficulties and struggle to adapt to the physical and radiometric diversity. These findings highlight a key limitation of current SR and motivate methods that disentangle super-resolution from domain adaptation.

10:20-12:00 Session 13C: MMS 2
10:20
From Formal Specifications to Executable Simulations: A Computation-Driven Metasystem for Agent-Based Modeling

ABSTRACT. Agent-Based Modeling and Simulation (ABMS) has become a widely used approach for analyzing complex systems in multidisciplinary fields such as healthcare and hospital Emergency Departments (EDs). However, the adoption of this methodology is often hampered by monolithic implementations in which domain knowledge is tightly intertwined with computational logic, limiting the long-term reusability and adaptability of simulation models. Inspired by Lego®’s modularity, this paper presents a metasystem based on a modular architecture centered on an agent metagenerator. The proposed approach conceptually encapsulates the definitions of the agents in brick-style agents, decomposing each agent into six canonical blocks. These blocks are independent of the target programming language, ensuring a clear separation between conceptual specifications defined by domain experts and their computational implementation by engineers and technicians. This separation of concerns facilitates multidisciplinary collaboration by enabling experts to explicitly define agent behavior through standardized specifications. Unlike large-scale data-driven approaches, all agent decisions are explicitly defined and calibrated using a small, controlled dataset, preserving transparency and traceability between the conceptual model and the resulting computational behavior. The proposed metasystem is validated through a proof-of-concept implementation using a simplified ED case study. The results demonstrate that the architecture prevents re-monolithization while enhancing modularity, providing a solid foundation for reusable, traceable, and scientifically grounded ABMS.

10:40
The Road to Flee 4: Food Security Coupling Revisited

ABSTRACT. Flee is an agent-based model used for conflict-driven population displacement, predicting where displaced persons may go when a conflict breaks out. It has been in use since its inception in 2016, and has more recently been used in collaborations with multiple universities, as well as NGOs such as Save the Children and World Watch Research. Throughout the years the code has been coupled to other models in different ways, ranging from monolithic to file coupling and MUSCLE3-based coupling. Since the release of Flee 3 in 2024, we have received an increasing number of requests for integrations with new decision-making approaches.

In this talk, I will briefly review the coupling approaches used in the past and reflect on these emerging features and couplings, and will highlight one coupling aspect that we first introduced in 2019 and have now revisited: food security coupling. Using this example, and others, I provide a perspective on how we intend to design Flee 4, which we envision will be the first version to support custom sub-models.

11:00
Transport Network Topology as a Determinant of Forced Displacement Dynamics: Case Study of Rail-Based Evacuation from Ukraine

ABSTRACT. Conflict-induced displacement is strongly shaped by transportation networks, yet most displacement models represent mobility as continuous or distance-based, neglecting network constraints such as corridor structure, connectivity, and capacity. This limits the ability of existing approaches to capture evacuation dynamics in situations where movement is mediated by infrastructure-specific determinants. This paper presents an agent-based model of infrastructure-constrained forced displacement that explicitly represents a real-world railway network. Implemented in Flee 3, the model simulates displacement during the early phase of the 2022 Russian invasion of Ukraine by dynamically generating agents in response to conflict-onset data from ACLED and routing movement exclusively along operational rail corridors toward railway-accessible border checkpoints. Simulations cover the period from February to June 2022, producing daily arrival counts at eight major border crossings for validation against a curated empirical dataset. The model achieves strong overall fit, with a median across-camp average relative difference (ARD) of 0.31 and sustained post-surge errors below 20% for most corridors. Beyond aggregate fit, the results reveal a clear two-phase structural transition. During the initial surge, flows are distributed across multiple corridors, with no single route consistently dominant. In the stabilized phase, flows consolidate sharply, indicating persistent single-corridor dominance. Ultimately, the findings suggest displacement trajectories are not governed by distance or conflict intensity alone, but by network topology, asymmetric capacity constraints, and endogenous corridor competition.

11:20
Satellite-based Conflict Damage Detection: Siamese CNNs for Forced Displacement Planning in Ukraine

ABSTRACT. Forced displacement due to armed conflict is an escalating global challenge, with over 117 million people displaced worldwide in 2025, creating urgent humanitarian crises that demand rapid and accurate response. Humanitarian actors responsible for managing both outward and inward displacement movements operate under severe resource constraints, limiting their capacity to conduct large-scale and timely assessments. Meanwhile, traditional damage assessment methods, which rely on ground-based surveys and expert-driven visual interpretation of available data sources, are time-consuming, resource-intensive, and often impractical in active conflict zones or immediately following catastrophic events. Hence, effective planning of population displacement requires timely and accurate identification of conflict-damaged areas.

To address this, we present a Siamese Convolutional Neural Network model for binary building damage detection using medium-resolution Sentinel-2 satellite imagery, a freely and openly available data source. Using pre- and post-conflict image pairs from the Copernicus Data Space Ecosystem and building damage annotations from UNOSAT, we design and evaluate eight model variants combining five backbone architectures (ResNet-50, ResNet-101, ResNet-152, VGG-16, and Vision Transformer) with two classification heads (Global Average Pooling and Multilayer Perceptron). Experiments are conducted across seven conflict-affected Ukrainian regions.

Our best-performing model, a Siamese ResNet-101 with an MLP head, achieves an Area Under the ROC Curve of 0.911 and an Average Precision of 0.888, demonstrating strong detection capability without high-resolution commercial imagery. We analyse the model's utility for both outward and inward humanitarian displacement planning and discuss its potential integration with agent-based models for simulating forced displacement scenarios. Our results demonstrate that resource-efficient, publicly available satellite data can support data-driven humanitarian planning at scale.

11:40
Credible Agent-Based Modeling of Crisis Evacuation: A Dual-Process Cognitive Architecture

ABSTRACT. Agent-based models of crisis evacuation typically assume uniform decision-making across agents and contexts; every agent weighs options the same way regardless of whether they are fleeing under active threat or recovering in a safer setting. This assumption misrepresents how human cognition operates under stress. We extend the Flee agent-based simulation framework with a dual-process cognitive architecture grounded in Kahneman’s System 1/System 2 theory, allowing agents to adaptively switch between fast heuristic (System 1) and deliberative (System 2) cognitive modes as a function of environmental conditions and network structure.

Standard evacuation models treat decision-making as a fixed process. In reality, people in crisis sometimes rely on rapid automatic judgments (e.g., following the crowd, fleeing away from the source of danger) and, at other times, engage in effortful deliberation, considering different routes, coordinating with family, or seeking information. Critically, System 2 deliberation is not always “better”: it requires cognitive resources that may be unavailable under acute stress, and it takes time that crisis conditions may not permit. Capturing this context-dependence is essential for models that aim to predict not just where people go, but how decision quality varies across populations and conditions. Moreover, as Kahneman extensively documents, System 2 reasoning is itself subject to a wide range of cognitive biases and can be lazy in its analysis, satisficing rather than genuinely scrutinizing. The present model captures the availability of deliberation rather than its quality, treating System 2 activation as a necessary but not sufficient condition for good decisions, a deliberate simplification that future work should relax.

The Framework. We formalize the probability of deliberative (System 2) processing, Pₛ₂, as the product of two necessary conditions: cognitive capacity Ψ, the agent’s ability to deliberate grounded in experience, and structural opportunity Ω, defined as whether the situation permits deliberation. Either condition alone is insufficient; both are required. This necessary-conditions logic motivates the multiplicative form: Pₛ₂ = pₛ₂ × Ψ(x; α) × Ω(c; β), where Ψ(x; α) = 1 / (1 + e^(−αx)) and Ω(c; β) = 1 / (1 + e^(−β(1−c))). Here, x = experience index; c ∈ [0,1] = conflict intensity; α, β > 0 control sensitivity; pₛ₂ = baseline activation probability.
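The multiplicative form above can be evaluated directly. The following is a minimal sketch of that arithmetic only, not code from the Flee framework; the function and argument names (`p_system2`, `p_base`, and so on) are hypothetical, and the parameter values in the usage lines are arbitrary.

```python
import math


def logistic(z: float) -> float:
    """Standard logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def p_system2(x: float, c: float, alpha: float, beta: float, p_base: float) -> float:
    """Probability of deliberative (System 2) processing.

    x      : experience index
    c      : conflict intensity in [0, 1]
    alpha  : sensitivity of cognitive capacity (alpha > 0)
    beta   : sensitivity of structural opportunity (beta > 0)
    p_base : baseline activation probability
    """
    if not 0.0 <= c <= 1.0:
        raise ValueError("conflict intensity c must lie in [0, 1]")
    if alpha <= 0.0 or beta <= 0.0:
        raise ValueError("alpha and beta must be positive")
    psi = logistic(alpha * x)             # cognitive capacity, Psi(x; alpha)
    omega = logistic(beta * (1.0 - c))    # structural opportunity, Omega(c; beta)
    return p_base * psi * omega


if __name__ == "__main__":
    # Same agent in a calm location (c = 0.1) and under heavy conflict (c = 0.9).
    calm = p_system2(x=2.0, c=0.1, alpha=1.0, beta=4.0, p_base=0.8)
    tense = p_system2(x=2.0, c=0.9, alpha=1.0, beta=4.0, p_base=0.8)
    print(calm, tense)
```

Because Ω decreases as c grows, the same agent is less likely to deliberate under higher conflict intensity. With the formula as stated, Ω equals 0.5 at c = 1 rather than 0, so deliberation is damped but never fully switched off.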

The multiplicative structure serves as an explicit scale-bridging mechanism: individual cognitive states (parameterized at the agent level by Ψ) couple to the network-level structural opportunity Ω to produce emergent population dynamics. This is a genuine multiscale interaction: micro-level cognition and meso-level infrastructure jointly determine macro-level evacuation outcomes in ways neither scale alone can predict.

Uncertainty Quantification and Credibility. Credibility is established through variance-based global sensitivity analysis using Sobol indices (Sobol, 1993; Saltelli et al., 2008), implemented via the EasyVVUQ framework. Sobol indices decompose total output variance into contributions from individual parameters and their interactions, distinguishing parameters that drive outcomes independently from those whose influence depends on the values of other parameters. These indices allow us to assess not just which parameters matter, but how and through what pathways.
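To make the variance decomposition concrete, here is a small sketch of a Monte Carlo estimate of first-order Sobol indices using a pick-freeze (Saltelli-style) estimator. It is not the EasyVVUQ API and not the authors' setup: it assumes independent inputs drawn uniformly from [0, 1], and `first_order_sobol` and `toy_model` are hypothetical names introduced for illustration.

```python
import random


def first_order_sobol(model, n_inputs, n_samples, seed=0):
    """Estimate first-order Sobol indices for independent U(0, 1) inputs.

    Uses two independent sample matrices A and B. For input i, A is re-evaluated
    with column i taken from B, and the index is estimated as
    mean(f(B) * (f(A_B^i) - f(A))) / Var(Y).
    """
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    y_a = [model(row) for row in a]
    y_b = [model(row) for row in b]

    # Total output variance, estimated from both sample sets.
    all_y = y_a + y_b
    mean = sum(all_y) / len(all_y)
    variance = sum((y - mean) ** 2 for y in all_y) / (len(all_y) - 1)

    indices = []
    for i in range(n_inputs):
        total = 0.0
        for row_a, row_b, ya, yb in zip(a, b, y_a, y_b):
            mixed = list(row_a)
            mixed[i] = row_b[i]  # replace only input i with its value from B
            total += yb * (model(mixed) - ya)
        indices.append(total / n_samples / variance)
    return indices


def toy_model(x):
    """Additive toy model: analytic first-order indices are 0.2 and 0.8."""
    return x[0] + 2.0 * x[1]


if __name__ == "__main__":
    print(first_order_sobol(toy_model, n_inputs=2, n_samples=50000, seed=1))
```

For an additive model like the toy one, first-order indices sum to one. When parameters interact, as the abstract reports for the baseline probability and topology, the first-order indices sum to less than one and the remainder is attributed to interaction terms.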

Key Findings. Early results reveal a clean separation of mechanisms across stylized network topologies. The opportunity parameter β appears to dominate System 2 activation rates, while network topology dominates evacuation outcomes. The cognitive parameters α and β exhibit limited independent effects on evacuation rate, whereas the baseline deliberation probability pₛ₂ shows a substantial interaction with topology, indicating that its influence on outcomes depends on network structure. This preliminary finding points to a clean causal separation: β governs whether agents deliberate, while topology governs whether deliberation translates into successful evacuation.

Our work demonstrates how established cognitive theory can be embedded in agent-based simulation in a parsimonious, falsifiable form amenable to formal uncertainty quantification. With only three parameters for the cognitive component, the model generates structured, mechanistically interpretable population dynamics. Explanatory power is maximized while model complexity is kept to a minimum.

References.
Kahneman, D. (2003). Maps of bounded rationality: Psychology for behavioral economics. American Economic Review, 93(5), 1449–1475.
Sobol, I.M. (1993). Sensitivity analysis for nonlinear mathematical models. Mathematical Modelling and Computational Experiments, 1(4), 407–414.
Saltelli, A., Ratto, M., Andres, T., Campolongo, F., Cariboni, J., Gatelli, D., Saisana, M., & Tarantola, S. (2008). Global Sensitivity Analysis: The Primer. Wiley.

10:20-12:00 Session 13D: MCDM 1
10:20
Attribute Importance in Conflict Models: Using Clustering Metrics and Bi‑Coalitions for Issue Evaluation

ABSTRACT. This paper presents a new approach to evaluate attribute importance in conflict situations, designed to support negotiation processes. Building on Pawlak's conflict model, the study introduces a bi-coalition-based measure that identifies issues on which groups of agents can reach unanimous agreement and examines how these issues influence coalition formation. This new qualitative measure is compared with an existing quantitative approach based on clustering metrics, showing that the two methods reveal different dimensions of conflict dynamics. The framework is illustrated using the Russia-Ukraine conflict, where territorial integrity emerges as the central dividing issue. The results demonstrate that the proposed bi-coalition‑based measure helps identify issues that make agreement more difficult, highlight those that offer greater potential for consensus, and support the development of more effective strategies for dispute mediation.

10:40
Simulating decision-makers' behaviour in risk management problems using prospect theory

ABSTRACT. Risk management plays a crucial role in multi-criteria decision analysis, as decision-makers must balance the attractiveness of alternatives with the uncertainty and risk associated with their selection. Behavioural aspects such as sensitivity to gains and losses and loss aversion significantly influence how risk is perceived and processed, yet their impact on the stability of decision-support outcomes remains insufficiently explored. This study investigates the robustness of recommendations generated by the Risk-Informed Decision-Making (RIDM) method, which extends traditional Multi-Criteria Decision Analysis (MCDA) by incorporating prospect-theory-based modelling of risk attitudes. A large-scale simulation framework is proposed to model decision-makers’ behaviour through systematic modifications of prospect-theory parameters representing gain sensitivity, loss sensitivity, and loss aversion. Three complementary experimental approaches are considered: directional behavioural profiles reflecting rational, emotional, and asymmetric attitudes; unequal responsiveness of risk parameters; and isolated single-parameter modifications. The experiments examine how changes in risk perception affect local ranking stability, measured as one-position shifts in the ordering of alternatives. The results indicate that asymmetric and gain-oriented behaviours lead to less stable rankings, requiring smaller behavioural changes to alter outcomes, while more balanced profiles exhibit higher robustness. These findings provide behavioural insight into the sensitivity of RIDM-based recommendations and support their interpretation in risk-aware decision-support applications.
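
As a rough illustration of the three behavioural parameters named in the abstract above, the sketch below uses the standard Tversky-Kahneman prospect-theory value function. It is not the RIDM method itself: the names `alpha`, `beta`, and `lam`, the equal-weight aggregation in `score`, and the two alternatives are assumptions made for this example. The defaults 0.88, 0.88, and 2.25 are the commonly cited 1992 Tversky-Kahneman estimates.

```python
def prospect_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Value of outcome x relative to a reference point of 0.

    alpha: gain sensitivity, beta: loss sensitivity, lam: loss aversion.
    """
    if x >= 0:
        return x ** alpha
    return -lam * (-x) ** beta


def score(outcomes, **params):
    """Equal-weight mean of prospect values (a stand-in for a real aggregation step)."""
    return sum(prospect_value(x, **params) for x in outcomes) / len(outcomes)


# Two hypothetical alternatives, each described by gains (+) and losses (-).
ALTERNATIVES = {
    "safe": [2.0, 1.0, -0.5],
    "risky": [6.0, 3.0, -4.0],
}


def ranking(**params):
    """Order the alternatives from best to worst under the given risk attitude."""
    scored = {name: score(vals, **params) for name, vals in ALTERNATIVES.items()}
    return sorted(scored, key=scored.get, reverse=True)


if __name__ == "__main__":
    print(ranking(lam=2.25))  # loss-averse profile
    print(ranking(lam=1.0))   # loss-neutral profile
```

With these made-up numbers, the order of the two alternatives depends on the loss-aversion parameter alone, which is the kind of one-position shift a stability analysis would look for.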

11:00
When Decisions Are Blocked: Inhibitory Rules Induced from Tree Ensembles

ABSTRACT. Decision trees and rule-based systems are widely used in various data mining tasks. Their great advantage is the ability to describe decision-making processes in a transparent manner. Unlike standard “if-then” decision rules, inhibitory rules have a consequent of the form “attribute $\neq$ decision”. In this paper, two types of inhibitory rules are defined: inner and general. Inner inhibitory rules are derived from complete paths in individual decision trees (paths from the root to a leaf of the tree), whereas general inhibitory rules are constructed using any attributes present across the trees. This work explores the problem of extracting both inner and general inhibitory rules that are valid for the largest possible number of decision trees within a given set. We introduce a polynomial-time algorithm to solve the inner inhibitory rule optimization problem, establish that optimizing general inhibitory rules is NP-hard, and propose a heuristic approach to approximate its solution. Experimental evaluations are conducted using synthetically generated decision tree sets to compare the performance of the proposed algorithms, taking into account the number of trees for which the constructed rules are true, and their length.

11:20
Computing Wasserstein Distances Between Asymmetric Interval Numbers

ABSTRACT. Asymmetric interval numbers (AINs) extend classical intervals by incorporating an expected value. Each AIN is associated with a canonical probabilistic representation in the form of a piecewise-constant approximating distribution composed of two uniform distributions, uniquely determined so that the normalization condition holds, reflecting the distributional asymmetry. However, classical interval distances, such as the Hausdorff distance, depend only on interval endpoints and assign distance zero to AINs sharing the same support but differing in expected value. In this paper, we derive closed-form analytical formulas for the Wasserstein distances $W_1$, $W_2$, and $W_\infty$ between AINs, exploiting the piecewise-linear structure of the quantile functions of the associated distributions. All three formulas are evaluated in constant time $\mathcal{O}(1)$, eliminating the need for numerical integration $\mathcal{O}(n)$. We prove that the proposed distances are proper metrics and establish their key structural properties, including translation invariance, positive homogeneity, continuity, and ordering $W_1 \leq W_2 \leq W_\infty$. We further demonstrate that these distances are sensitive to distributional asymmetry, making them strictly more informative than classical interval metrics.
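
The abstract above does not state the closed-form formulas, so the following is only a sketch of the underlying idea. It assumes an AIN is a triple (a, c, b) with a < c < b whose distribution is uniform on [a, c] and on [c, b], with the mass of the left piece chosen as p = (b - c)/(b - a) so that the total mass is 1 and the mean equals c. Under that assumption the quantile function is piecewise linear with a single breakpoint at p, so the difference of two quantile functions is linear between breakpoints and each segment can be integrated exactly. The function names are hypothetical.

```python
import math


def ain_quantile(a, c, b):
    """Return (p, Q): left-piece mass and quantile function of the two-piece uniform law."""
    if not a < c < b:
        raise ValueError("this sketch assumes a < c < b")
    p = (b - c) / (b - a)  # mass on [a, c]; makes the mean equal to c

    def q(u):
        if u <= p:
            return a + (c - a) * u / p
        return c + (b - c) * (u - p) / (1.0 - p)

    return p, q


def wasserstein(ain_x, ain_y):
    """Return (W1, W2, W_inf) between two AINs given as (a, c, b) triples."""
    px, qx = ain_quantile(*ain_x)
    py, qy = ain_quantile(*ain_y)
    knots = sorted({0.0, px, py, 1.0})
    w1 = w2_sq = w_inf = 0.0
    for lo, hi in zip(knots, knots[1:]):
        h = hi - lo
        f0 = qx(lo) - qy(lo)  # the difference is linear on [lo, hi]
        f1 = qx(hi) - qy(hi)
        if f0 * f1 >= 0:      # no sign change: trapezoid area of |f|
            w1 += h * (abs(f0) + abs(f1)) / 2
        else:                 # sign change: two triangles
            w1 += h * (f0 * f0 + f1 * f1) / (2 * (abs(f0) + abs(f1)))
        w2_sq += h * (f0 * f0 + f0 * f1 + f1 * f1) / 3  # exact integral of f squared
        w_inf = max(w_inf, abs(f0), abs(f1))            # a linear piece peaks at its ends
    return w1, math.sqrt(w2_sq), w_inf


if __name__ == "__main__":
    # Same support [0, 2], different expected values: endpoint-based distances see no difference.
    print(wasserstein((0.0, 1.0, 2.0), (0.0, 1.5, 2.0)))
```

Each call touches at most three segments, so the cost is constant, in line with the O(1) claim in the abstract; the exact formulas derived in the paper may of course be organised differently.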

10:20-12:00 Session 13E: ComPsy 1
10:20
An Agent Based Model of Effects of Sleep Deprivation on Suicidality

ABSTRACT. Computational modelling is increasingly used as a tool to understand psychological phenomena, including suicidal behaviour and outcomes. However, the use of Agent Based Models (ABMs) in suicide research is still nascent. The goal of this project is to extend a formal computational model of suicide with social and state-dependent dynamics enabled by ABM architecture, and to evaluate their effects on suicidal thought within the extended model. The model proposed in this project is an ABM implementation of the General Escape Theory of Suicide, combined with aspects of the Interpersonal Theory of Suicide on the social effects of burdensomeness and connectedness. The model is based on a system of differential equations, as proposed by Wang et al. and extended by Engels. The prevalence of suicidal thought and aversive internal state was compared between a baseline model without state-dependent effects, a model with the effect of sleep deprivation, and a model with all new additions including social effects. Sleep deprivation was found to significantly increase mean suicidal thought and aversive internal state experienced. The addition of social dynamics reduced suicidal thought and aversive internal state in the proposed model.

10:40
Machine learning to predict mental health problems in early childhood using behavioral screening questionnaires

ABSTRACT. Introduction. Early detection of mental health problems in children is crucial for outcomes later in life [1]. Here, we aimed to predict problem behavior, as a proxy for mental health, in 3-year-old children and identify potential risk factors from self- and parent-report questionnaire data on maternal mental health and early child development using multiple (explainable) machine learning (ML) approaches. Methods. Our data consisted of 638 mother-child pairs from the YOUth study, a population-based longitudinal cohort. The predicted outcome was the score on the internalizing and externalizing scale of emotional and behavioral problems of the Child Behavior Checklist (CBCL/1½–5), measured at 3.7±0.8 years. Predictors included behavioral screening questionnaire data that assessed maternal mental health during pregnancy (Adult Self Report (ASR)) and social-emotional development of the child at 5 and 10 months (ASQ:SE-2), as well as socio-demographic predictors (age of child, sex of child (52.7% female), birth weight, household income, maternal education), known to affect mental health care use [2]. The ASR for maternal mental health was operationalized in three ways: (1) a total score (ASRtotal), (2) 100 individual item scores (ASRitem), and (3) 12 subscale scores (ASRsubscale). We trained five distinct multioutput ML models, i.e., linear regression, elastic net, decision tree, random forest and multi-layer perceptron (MLP), separately on ASRtotal, ASRitem, and ASRsubscale together with the other predictors. MLP and tree-like methods were expected to perform well on ASRitem, as this kind of ML can maximize predictive power with detailed data. However, due to our relatively small sample size, earlier research suggests that elastic net regression with ASRsubscale would be the best fit [3]. Training for each model was done on 80% of data and tuning was done for each format of ASR data using Bayesian optimization.
We compared mean squared error (MSE) between models on joint performance on the internalizing and externalizing scale and examined coefficients and feature importance per outcome scale. Results & Discussion. Random forest and elastic net regression had the lowest MSE of all models for ASRtotal, ASRitem, and ASRsubscale on the test set (respectively 38.6, 37.0 and 44.8 for random forest and 40.1, 34.3 and 44.7 for elastic net). Performance for all models was lowest on the ASRitem test set (min. 44.7 for elastic net, max. 58.8 for MLP). This implies the models had difficulty with the high predictors-to-sample ratio. This is in contrast with [4], where item-levels showed higher performance for every sample size. Overall performance on the test set amounted to a mean residual of 6.3. Since 6 points can make a difference between a non-clinical and clinical score, performance should be increased before this model can be used in practice. This might be explained by the right skew in the CBCL outcomes, since we work with a population cohort. Resampling outcomes at the tail end of the distribution or supplementing the data with a clinical cohort might make the data more balanced and improve performance. Examining the weights for the explainable methods, we found that social-emotional development at 10 months was an important predictor across conditions. Notably, the score for the development at 5 months was not. For the externalizing scale, attention problem related items and the corresponding subscale showed high feature importance. Otherwise, the importance of subscales and items showed large variability between models, despite the total score having a high weight in the ASRtotal condition. This highlights a possible issue in explainability when choosing item-level quantification of questionnaires.
Overall, the results suggest that both internalizing and externalizing behavioral problems around 3 years of age can be predicted based on self- and parent-report questionnaire data collected during pregnancy and infancy. However, further work is needed to improve predictive performance. Among the evaluated approaches, elastic net regression applied to subscale-based representations yielded the strongest predictive performance. Given that we showed that social-emotional development in 10-month-old infants is in part predictive of behavioral outcome at age 3, the importance of monitoring child development from a very early age is underscored. Disclosure of Interests. The authors declare no conflicts of interest. Full results table and code are available upon request to d.k.weltevreden@uu.nl. References 1. Hudson, J. L. et al. Clinical Child and Family Psychology Review 26, 593–641 (July 2023). 2. Eijgermans, D. et al. Children and Youth Services Review 149, 106933 (June 2023). 3. Pratiwi, B. C. et al. Computational Statistics & Data Analysis 185, 107767 (Sept. 2023). 4. Putka, D. J. et al. Organizational Research Methods 21, 689–732 (July 2018).

11:00
Limitations of Idiographic Machine Learning Forecasting in Mental Health Data

ABSTRACT. This talk evaluates machine learning methods’ performance using EMA and passive sensing data from the Brighten study. Comparing models with varied exogenous inputs, we find that simple baselines outperform sophisticated models over longer horizons. Findings highlight limits of current methods and the need to reassess symptom-level EMA features.

Personalized machine learning models in clinical psychology are the subject of significant interest for their potential to improve treatment and predict symptomatology. However, much research has focused on predicting short-term horizons following substantial training windows, using ex-post (known future information from other variables) forecasts to improve their prediction. Few studies explore their effectiveness at predicting symptoms at long horizons, without known exogenous regressors. This calls into question their usefulness in the world of digital mental health, as well as the quality of the ecological momentary assessment (EMA) data being collected for model training.

This research examines current limitations of idiographic machine learning forecasting with EMA and passive sensor data from 67 patients with depression over 66 days from the Brighten dataset. Forecasting models such as ARIMA, neural networks, ETS, personalized mean, and an ensemble combination of all models (i.e., an average of all algorithms) were trained on 15 datapoints, with forecast horizons extending to 51 datapoints. These models were analyzed with known exogenous variables (ex-post), predicted exogenous variables (ex-ante), and without any included exogenous information. These results were compared to simpler models and baseline statistics, such as individual EMA means and ETS models.

Results indicate that ARIMA and combination models (using a weighted average of all the models) perform well or comparably at short horizons. However, more sophisticated models struggle to make accurate predictions as the distance from their training data grows. Summary statistics or simple models (e.g., individuals’ affect means or ETS models) show lower MSE values in comparison, with ex-ante evaluation also performing slightly better than ex-post at longer horizons. These results suggest that machine learning for symptom prediction should be assessed at longer horizons, and that EMA features should be further investigated for functionality.

11:20
Machine learning in the prediction of treatment response for emotional disorders: A systematic review and meta-analysis

ABSTRACT. Background: Emotional disorders such as depression and anxiety affect millions globally and pose a significant burden on public health. Personalized treatment approaches using machine learning (ML) to predict treatment response could revolutionize treatment strategies. However, there is limited evidence as to whether ML is successful in predicting treatment outcomes. This meta-analysis aims to evaluate the accuracy of ML algorithms in predicting binary treatment response (responder vs. non-responder) to evidence-based psychotherapies, pharmacotherapies, and other treatments for emotional disorders, and to examine moderators of prediction accuracy.

Methods: Following PRISMA guidelines, a comprehensive literature search was conducted across PubMed and PsycINFO from January 1st, 2010 to March 27th, 2025. Studies were included if they used ML methods to predict treatment response in patients with emotional disorders. Data were extracted on sample size, type of treatment, predictors used, ML methods, and prediction accuracy. Meta-analytic techniques were used to synthesize findings and identify moderators of prediction accuracy.

Results: Out of 3816 non-duplicate records, 155 studies met inclusion criteria. The overall mean prediction accuracy was 0.76 (95% CI: 0.74-0.78), and the mean area under the curve was 0.80, indicating good discrimination. The average sensitivity and specificity were 0.73 and 0.75, respectively. Moderator analyses indicated that studies using more robust cross-validation procedures exhibited higher prediction accuracy. Neuroimaging data as predictors were associated with higher accuracy compared to clinical and demographic data. Moreover, results indicated that studies with larger responder rates, as well as those that did not correct for imbalances in outcome rates, were associated with higher prediction accuracy.

Conclusions: ML methods show promise in predicting treatment response for emotional disorders, with varying degrees of accuracy depending on the type of predictors used and the rigor of methodological procedures implemented. Future research should focus on improving methodological integrity and exploring the integration of multimodal data to enhance prediction accuracy.

10:20-12:00 Session 13F: SPU
10:20
Controlling Structure Under Uncertainty: Decision-Theoretic Topology for Extreme-Scale Scientific Simulation

ABSTRACT. Classical scientific computing pipelines optimize numerical accuracy or residual error, yet many downstream decisions depend primarily on the stability of structural or topological properties rather than pixel- or state-wise fidelity. We propose a decision-theoretic framework for adaptive scientific computing in which topological summaries of intermediate solutions are treated as the system state, and computational resources are allocated to minimize uncertainty in task-relevant structural descriptors. Using persistent homology as a multiscale structural representation, we define uncertainty measures in topological space and formulate adaptive computation as an optimal experimental design and control problem. A closed-loop algorithm iteratively selects simulation queries, measurement operators, or discretization refinements to reduce expected topological risk. We demonstrate the approach on synthetic PDE-driven fields and multimodal imaging examples, showing that decision-stable structural states are reached with substantially fewer computational resources than error-driven or image-centric baselines. The results suggest a shift from accuracy-centric to decision-centric scientific computing, where structure under uncertainty becomes a primary control objective.

10:40
Comparison of Epistemic Uncertainty Quantification Methods for Out-of-Distribution Detection in Autoencoder–RNN Surrogate Model of Molecular-Continuum Flow Simulations

ABSTRACT. Neural network surrogates have been shown to decrease computational costs of simulations, but often at the risk of unreliable predictions. This work integrates and evaluates multiple epistemic uncertainty quantification methods for a reproduced convolutional autoencoder–recurrent neural network surrogate architecture for molecular data in a coupled spatiotemporal molecular-continuum flow prediction. The surrogate is trained on an idealized Kármán vortex street dataset generated using the molecular–continuum simulation framework MaMiCo, and evaluated on three out-of-distribution datasets with progressively increasing shifts from the training distributions. In the Autoencoder model, the Deep Ensemble method sets a strong baseline, but after fine-tuning, both Gaussian Processes and Evidential Deep Learning show promising detection skills and faster inference than Deep Ensemble. This trend continues in the Autoencoder-RNN, which employs a propagation approach for Evidential distributions and an RNN-influenced latent space for Gaussian Processes.

11:00
Modelling Extreme Uncertainty: Estimating Maximum Queue Size of Systems with Pareto Inter-Arrival Times and Pareto Service Times

ABSTRACT. We propose an approach to modelling maximum queue sizes for heavy-tailed inter-arrival and service times. We derive models for high percentiles of queue length based on the principle that, for subexponential distributions, large deviations of cumulative workload are dominated by a single extreme observation. This allows the distribution of aggregate workload to be approximated through the distribution of the maximum service time, leading to tractable models for extreme queue length quantiles. We derive parametric models that require fewer fitted parameters than extreme value methods, including generalized extreme value (GEV), generalized Pareto (POT), and power-law tail models. Event-driven Monte Carlo simulations of heavy-tailed single-server queues are used to evaluate and compare the proposed models. We show that the model called Par Sum Exp gives the best results.
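
The kind of Monte Carlo experiment mentioned above can be sketched as follows. This is an illustrative single-server FIFO queue with Pareto inter-arrival and service times, not the authors' simulator; the function names, the scale parameters, and the use of the number of customers in the system as "queue size" are assumptions of this example.

```python
import random
from collections import deque


def max_queue_length(n_customers, arrival_shape, service_shape,
                     arrival_scale=1.0, service_scale=0.5, seed=0):
    """Largest number of customers simultaneously in a FIFO single-server system."""
    rng = random.Random(seed)
    in_system = deque()      # departure times of customers still present
    t_arrival = 0.0
    last_departure = 0.0
    longest = 0
    for _ in range(n_customers):
        # random.paretovariate(alpha) samples a Pareto law with minimum value 1.
        t_arrival += arrival_scale * rng.paretovariate(arrival_shape)
        # Customers whose service finished before this arrival have left.
        while in_system and in_system[0] <= t_arrival:
            in_system.popleft()
        start = max(t_arrival, last_departure)  # wait for the server if busy
        last_departure = start + service_scale * rng.paretovariate(service_shape)
        in_system.append(last_departure)
        longest = max(longest, len(in_system))
    return longest


def empirical_quantile(samples, q):
    """Simple order-statistic estimate of the q-quantile."""
    ordered = sorted(samples)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


if __name__ == "__main__":
    runs = [max_queue_length(5000, 2.5, 1.8, seed=s) for s in range(200)]
    print("estimated 0.95 quantile of the maximum:", empirical_quantile(runs, 0.95))
```

Repeating the run over many seeds gives an empirical distribution of the maximum, against which a parametric model of high percentiles could be compared.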

11:20
Quantifying and Mitigating Epistemic Uncertainty in Local Rule‑Based Explanations

ABSTRACT. Uncertainty is intrinsic to statistical learning, arising from multiple sources. One recently examined form is epistemic uncertainty, which stems from the difficulty humans face in understanding or inspecting the internal workings of black‑box models. Explainable AI (XAI) aims to mitigate this by revealing how models operate. However, local post-hoc explainers can sometimes have the opposite effect. In particular, local rule-based methods may produce contradictory explanations across different neighborhoods, undermining user trust and obscuring the model's global logic. We propose a framework for detecting, quantifying, and mitigating inconsistencies among local rule‑based explanations. Our contributions include Conflict‑Conditioned Empirical Disagreement under Uncertainty (CC‑EDU), a metric for neighborhood‑level inconsistency, and a restriction mechanism that refines overly general rules. Experiments on selected benchmark datasets show that our framework correctly detects and reduces inter‑rule contradictions while preserving fidelity.

12:00-12:30 Session 14: Poster Session

The posters are the same for all three poster sessions. For the list of posters, please refer to the poster session on Monday, June 29th.

12:30-13:50 Lunch
13:50-15:30 Session 15A: CompPet 2
13:50
Generative ML for Future Calorimeter Simulation

ABSTRACT. Generative ML plays a vital role in fast collider simulation for high energy physics. It is only by making simulation inference orders of magnitude faster that sufficient simulated predictions can be generated to make good use of observed data. In this work, I present an overview of the work at DESY to prepare generative simulation techniques for future colliders.

14:10
Integrating Policy and Infrastructure for Effective Data Management at the European XFEL

ABSTRACT. As scientific data volumes at the European XFEL continue to grow rapidly, reaching up to 2 PB per day and peak acquisition rates of up to 15 GB/s per instrument, the need for sustainable storage, efficient processing, and long-term preservation has become critical. To address these challenges, the facility has introduced a new scientific data policy that strengthens life-cycle governance while aligning with FAIR principles (Findable, Accessible, Interoperable, Reusable). The policy embeds sustainability, transparency, collaborative sharing, and responsible reuse of research data directly into the operational services. A central component is the mandatory integration of Data Management Plans (DMPs) from the experiment planning stage onward, ensuring structured workflows across the entire life-cycle. This governance framework is technically realized through myMdC, the facility’s unified metadata backbone, in operation since the start of user experiments in 2017. myMdC orchestrates metadata exchange across distributed facility services, enabling real-time dataset registration, metadata injection, embargo management, DOI minting, and the integration of FAIR-compliant publication workflows. By consolidating metadata management and embedding policy requirements directly into operational services, myMdC ensures interoperability, data integrity, and scalable automation throughout the facility. To support these policy-driven services, European XFEL relies on a multi-layered data management infrastructure designed for performance, scalability, and long-term sustainability. The architecture is organised into four tightly integrated storage layers. The Online storage layer functions as a high-speed cache, capturing the extremely high data rates generated during experiments. It feeds into the High-Performance Storage layer, which supports real-time data processing and near-online analysis during beamtime as well as post-experiment evaluation.
Both layers are interconnected via a high-bandwidth InfiniBand fabric, including a 4.4 km, 1 Tb/s link between the experiment hall and the DESY computing centre, ensuring rapid and reliable data transport. The third layer, Mass Storage, provides expanded capacity for mid-term data access and detailed reprocessing, while the Tape Archive layer guarantees secure, cost-effective long-term preservation with a minimum retention period of ten years. High-performance and mass storage systems are directly connected to computing clusters, enabling scalable near-online and offline analysis as well as controlled data export. Data reduction strategies and a revised retention model optimise the use of this tiered infrastructure, balancing performance, cost efficiency, and scientific value. Together, the integrated policy framework, metadata orchestration services and high-performance infrastructure establish a coherent and sustainable data ecosystem. This approach demonstrates how governance and technical architecture can be systematically combined to support FAIR-aligned data stewardship at extreme scale.

14:30
Bit Level Data Unpacking Using Heterogeneous Architecture Setups

ABSTRACT. Modern high-performance computing centers increasingly opt for heterogeneous system designs, integrating general-purpose computing cores with accelerators, to deliver high computing performance and high efficiency. Scientific computing codes need to adapt to this change in cluster architecture, which introduces the challenge of hardware portability. We introduce a hardware-portable implementation of a data unpacking algorithm, a key part of the data processing system for particle detector readout systems, written with the header-only C++ abstraction library alpaka. This concept has great relevance for high-energy physics research at CERN, where different computer accelerators from diverse vendors are used. The code is portable to CUDA, HIP, OpenMP, and serial back ends, requiring no tailoring to a specific platform. We validate the approach using the high-throughput data processing requirements of the CMS experiment at CERN and verify the conservative, lossless nature of the given data unpacking procedure. The performance improvement using heterogeneous hardware exceeds a factor of 32 in average throughput for the single-threaded case.
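
To make the idea of bit-level unpacking concrete, here is a small Python sketch. It is not the detector format or the portable C++ code described above: the 12-bit field width, the MSB-first layout, and all function names are assumptions chosen for illustration. Each output value depends only on its own bit offset, which is the property that lets such a routine be mapped onto many parallel threads.

```python
def pack_fields(values, width):
    """Pack unsigned integers of `width` bits each, MSB first, into bytes."""
    acc = 0
    for v in values:
        if not 0 <= v < (1 << width):
            raise ValueError("value does not fit in the field width")
        acc = (acc << width) | v
    total_bits = width * len(values)
    pad = (-total_bits) % 8                # zero bits appended to reach a byte boundary
    return (acc << pad).to_bytes((total_bits + pad) // 8, "big")


def unpack_field(buf, index, width):
    """Extract field `index`; needs only the bytes that overlap its bit range."""
    first_bit = index * width
    last_bit = first_bit + width           # exclusive
    first_byte = first_bit // 8
    last_byte = (last_bit + 7) // 8        # exclusive
    chunk = int.from_bytes(buf[first_byte:last_byte], "big")
    drop = last_byte * 8 - last_bit        # trailing bits that belong to later fields
    return (chunk >> drop) & ((1 << width) - 1)


def unpack_all(buf, count, width):
    """Serial loop here; every iteration is independent of the others."""
    return [unpack_field(buf, i, width) for i in range(count)]


if __name__ == "__main__":
    samples = [0xABC, 0x123, 0xFFF, 0x000, 0x800]
    packed = pack_fields(samples, 12)
    assert unpack_all(packed, len(samples), 12) == samples  # lossless round trip
    print(packed.hex())
```

The round-trip assertion plays the role of the lossless check mentioned in the abstract, in miniature.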

14:50
Computing and FAIR data in Lattice QCD

ABSTRACT. The simulation of the fundamental theory describing the strong interactions, Quantum Chromodynamics (QCD), via its Euclidean-time, discrete formulation on a lattice, consumes a substantial fraction of computer time at leadership HPC facilities. These simulations proceed via Markov chain Monte Carlo, to generate a representative set of configurations of the QCD vacuum, referred to as gauge field ensembles, which are then analyzed in a subsequent step to obtain expectation values of observables of interest.

Multiple international collaborations carry out simulations for the generation of ensembles, differing in their simulation approach, including the choice of discretization of the continuum theory. Each ensemble corresponds to one choice of the QCD parameters such as the QCD coupling, the quark masses, and the extent of the finite volume. When the continuum theory is restored, at the infinite volume limit and for physical values of the quark masses, all discretizations must agree, allowing independent validation across collaborations.

The production of ensembles is a multi-year endeavor, resulting in data sets that are intended to be reused for future analysis campaigns. The value of the generated ensembles thus extends over time, and it is common practice that each collaboration stores, retrieves, and reuses their ensembles, in many cases also sharing the ensembles with researchers beyond the members of the collaboration. Nowadays, the generation of a typical ensemble consisting of O(10^3) configurations can require up to O(100) million core hours and O(10) to O(100) TBytes in storage, depending on the parameters used, such as the number of grid points.
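
The storage figures quoted above can be checked with a back-of-the-envelope estimate. The sketch assumes uncompressed double-precision storage of one full 3×3 complex SU(3) matrix per link and four links per lattice site; the 96³×192 lattice is a hypothetical example, and real file formats may add headers or use compression.

```python
def gauge_config_bytes(nx, ny, nz, nt, n_colors=3, n_dirs=4, bytes_per_real=8):
    """Size of one gauge configuration under the stated storage assumptions."""
    sites = nx * ny * nz * nt
    reals_per_link = 2 * n_colors * n_colors   # real and imaginary parts of a 3x3 matrix
    return sites * n_dirs * reals_per_link * bytes_per_real


def ensemble_terabytes(n_configs, *lattice):
    """Total ensemble size in TB (10**12 bytes)."""
    return n_configs * gauge_config_bytes(*lattice) / 1e12


if __name__ == "__main__":
    per_config = gauge_config_bytes(96, 96, 96, 192)
    print(f"one configuration: {per_config / 1e9:.1f} GB")
    print(f"1000 configurations: {ensemble_terabytes(1000, 96, 96, 96, 192):.1f} TB")
```

For this hypothetical lattice, an ensemble of 10^3 configurations comes to just under 100 TB, at the upper end of the range stated in the text.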

The ability to share data based on the FAIR principles is, therefore, of key importance for this research community. As early as 2003, this insight led to an international effort to establish the International Lattice Data Grid (ILDG). ILDG has been established as a federation of autonomous regional grids, within a single Virtual Organization. Examples of regional grids include CSSM, JLDG, LDG, UK LFT, and USQCD for Australia, Japan, continental Europe, the UK, and the US, respectively, each operating their own services.

In this contribution, we will review the state of the art in lattice QCD simulations, providing an overview of their computational and storage requirements. We will provide a summary of the evolution of ILDG, which initially leveraged grid technologies such as those used for the LHC grid at that time. We will highlight the efforts of international working groups towards establishing the relevant specifications, including a metadata specification that was formalised in XML schemata (QCDml). This specification, which is based on a community-wide consensus to concisely mark up the gauge configurations, is a key asset of ILDG. The groups furthermore developed relevant middleware tools for facilitating the use of ILDG services. Notably, the middleware and metadata specifications developed around 2007 already adhered to most of the FAIR principles, which were formulated nearly a decade later.

Over the past three years, we have modernised key elements of ILDG. During this update, regular interactions with the international research community took place in order to also collect new user requirements. The key goals were full compliance with the FAIR principles and an infrastructure based on cloud technologies. Key elements of ILDG 2.0 that will be detailed in our contribution include: an update of the metadata specification for improved FAIR compliance; global user management through a dedicated INDIGO IAM instance, enabling token-based, fine-grained access control that replaces grid certificates; a metadata catalog supporting multiple, freely configurable schemata; and modular file catalogs. Together, these provide fine-grained embargo handling, facilitate the use of data stores based on cloud technologies, and allow for a flexible framework that extends naturally to collaboration-internal data sharing.

ILDG 2.0 is now fully operational, and its architecture is being explored for use cases beyond lattice QCD, such as axion experiments and radio astronomy.

15:10
Asapo: a data streaming platform for high performance data processing pipelines

ABSTRACT. High-throughput scientific experiments generate massive volumes of data that require near–real-time processing for time-critical decision-making. Developing robust stream processing in distributed computing environments, however, presents significant challenges. Asapo is a streaming platform developed and deployed at DESY to enable scalable and reliable online data processing. The Asapo platform and data-processing pipelines based on this platform are provided to multiple experimental stations across the campus. The service has been positively evaluated by scientists, demonstrating its ability to accelerate scientific progress [1, 2, 3].

Conceptually, Asapo follows a modular microservice-based software architecture and operates independently of the experiment control system. The platform enables the distribution of arbitrary instrument data, including metadata, across distributed memory buffers, ensuring uninterrupted data processing even at high acquisition rates. The current deployment features a 1.5 TB in-memory cache and supports message rates of up to 20 kHz. Additionally, the system allows users to simultaneously buffer data for online analysis while permanently storing it to disk, and provides seamless switching between online and offline analysis modes. Asapo is explicitly designed for efficient transmission of multi-megabyte messages and supports multiple data-transfer technologies, including TCP and InfiniBand.

A central design objective of Asapo is to simplify the implementation of data-reduction pipelines, allowing users to process data without requiring persistent storage of intermediate results. Only the final reduced “distillate” needs to be archived, which significantly reduces storage overhead and simplifies workflow management. The platform supports both scale-up and scale-out deployment scenarios with minimal configuration overhead, enabling efficient operation from small single-node setups to large distributed clusters. From a user perspective, Asapo is designed to start simple and evolve toward more advanced configurations, including multi-sub-channel data streams and complex processing topologies, without requiring fundamental architectural changes. To facilitate integration into existing scientific software environments, Asapo provides native bindings for C/C++ and Python.

Asapo connectors run close to the detectors, where they acquire data, encapsulate them into Asapo messages, and transmit them to the Asapo cluster. Within the cluster, incoming data are temporarily stored in the in-memory cache and persistently written to disk in the NeXus format. In parallel with raw data acquisition and storage, data-processing pipelines execute on high-performance computing (HPC) resources. The processed results are made available to users together with the corresponding raw data, enabling rapid feedback during experiments.

For experiments where data are acquired from multiple sources (for example, several detectors or independent detector modules), Asapo includes built-in synchronization mechanisms. These provide unified and consistent access to data originating from different detectors.

[1] Mikhail Karnevskiy et al. “Automated pipeline processing X-ray diffraction data from dynamic compression experiments on the Extreme Conditions Beamline of PETRA III”. In: Journal of Applied Crystallography 57.4 (Aug. 2024), pp. 1217–1228. doi: 10.1107/S1600576724004114.
[2] S. Haas et al. “The new small-angle X-ray scattering beamline for materials research at PETRA III: SAXSMAT beamline P62”. In: Journal of Synchrotron Radiation 30.6 (Nov. 2023), pp. 1156–1167. doi: 10.1107/S1600577523008603.
[3] Thomas White et al. “Real-time data processing for serial crystallography experiments”. In: IUCrJ 12.1 (Jan. 2025). doi: 10.1107/S2052252524011837.

15:30
Key4hep - The Common Software Stack for Future Particle Physics Experiments

ABSTRACT. Studying the capabilities and physics reach for experiments at future particle physics facilities requires a large and diverse ecosystem of different software packages. These have to be built and deployed in a consistent manner to allow individual physicists to conduct their research without having to worry about the details. The Key4hep software stack aims at providing this experience by combining existing components and selected dedicated developments into a coherent software stack.

13:50-15:30 Session 15B: CV-LNM 2
13:50
Geometry-aware Recognition of Mouth Articulations for Sign Language Understanding

ABSTRACT. Continuous recognition of vowel-related mouth articulations in Japanese Sign Language (JSL) requires modeling fine-grained geometric structure and temporally coherent lip dynamics. We formulate vowel-level mouth articulation recognition as a structured spatio-temporal graph learning problem over lip landmark trajectories. A geometry-aware multi-branch ST-GCN architecture is proposed, operating on 41 nose-anchored facial landmarks and enriched with linear and angular motion descriptors to capture both spatial configuration and dynamic deformation. Experiments on a publicly available JSL dataset show that the proposed Lip-STGCN outperforms baseline ST-GCN and tree-based models under a strict cross-subject evaluation protocol, achieving a macro F1-score of 0.63. Ablation analysis confirms that jointly modeling structured geometry and motion dynamics is essential for robust continuous vowel articulation recognition.

14:10
Hybrid Privacy-preserving Histopathological Image Classification Using Fully Homomorphic Encryption

ABSTRACT. Automated analysis of histopathological scans allows for cancer diagnosis, yet deploying deep learning models for automated classification brings the risk of misuse of patient data and privacy violations. Fully homomorphic encryption (FHE) is a cryptographic paradigm that allows for computation to be performed on encrypted data by untrusted parties, without decrypting the data. While FHE solves the problem of preserving privacy for remote computation on sensitive data, it is prohibitively computationally expensive when used with deep neural networks. To address this bottleneck, this study proposes a hybrid architecture that distributes computation between the client and the server. Our method utilizes a pretrained DINO ViT model for local image feature extraction on the client, followed by dimensionality reduction using the principal and neighborhood component analysis methods. These reduced features are then encrypted and classified remotely by a support vector machine (SVM) that can be executed in untrusted environments using FHE. We evaluated this approach, demonstrating that feature dimensions can be reduced by approximately 50% with less than one percentage point decrease in classification accuracy, while dramatically reducing encrypted model inference time.

14:30
PISC: Physics-Informed Scene Constraints Method for Markerless 3D Human Localization for Safe Human-Robot Collaboration

ABSTRACT. In this paper, we present a new markerless and physics-aware method for 3D human localization designed for human-machine interaction with collaborative robots (cobots). Unlike typical SMPL-based pose regressors that often yield unrealistic body configurations, or voxel-based models that require retraining for each camera setup, our method uses multi-view geometric constraints, specifically epipolar consistency, to guide physically plausible pose estimation. Our proposed approach eliminates the need for wearable sensors and adapts naturally to reconfigurable environments, improving localization robustness and realism in shared workspaces.

14:50
Vision-Guided Agricultural Robot for Crop Detection Using Edge Computing on Embedded Hardware

ABSTRACT. Agricultural automation is increasingly important for addressing global food security challenges and labor shortages in the farming sector. This paper presents a vision-guided robotic system for real-time fruit and vegetable detection deployed on embedded hardware. The system utilizes LeYOLO-Nano, a lightweight object detection model optimized for edge devices, running on an NVIDIA Jetson platform. We trained and evaluated the model on a multi-class dataset containing 22,157 images across 31 different categories of fruits and vegetables. The dataset was split into training (17,731 images), validation (2,797 images), and testing (1,629 images) subsets to ensure robust model generalization. The LeYOLO-Nano architecture was chosen for its minimal computational requirements while maintaining detection accuracy, making it suitable for real-time inference on resource-constrained embedded systems. Our experimental results demonstrate the feasibility of deploying advanced computer vision models on embedded platforms for agricultural applications, enabling autonomous crop detection and harvesting assistance in field conditions. The proposed system represents a step toward affordable and efficient agricultural robotics that can operate in real farming environments.

15:10
SG-DiT: Semantic-Guided Sign Language Anonymization Balancing Privacy and Linguistic Fidelity

ABSTRACT. Sign language motions contain individual-specific kinematic features. As the engineering applications of sign language become more widespread, privacy protection of sign language data has emerged as a new challenge. This paper proposes a diffusion model-based approach for sign language motion anonymization. The proposed framework combines conditional diffusion processes with adversarial training to transform identity features while preserving semantic information. For the design and preliminary validation of the proposed model, we conduct a proof-of-concept experiment using a subset of 22 signers from the ASL100 dataset of WLASL, which demonstrates the feasibility of the proposed approach for sign language anonymization.

13:50-15:30 Session 15C: MMS 3 & TAMCS
13:50
Cross-Validation-Based Hierarchical Decision Tree Framework for Dispersed Data Classification

ABSTRACT. The proliferation of fragmented, high-dimensional data across independent sources renders centralized classification impractical due to structural and privacy constraints. Existing hierarchical frameworks often worsen these challenges by taking part of the already limited test data for validation, reducing the reliability of final evaluation. This paper introduces a hierarchical decision tree architecture for dispersed sources that removes the need to extract a validation subset from the test data during global training. The method uses a two‑level learning strategy. At the local level, decision trees are trained independently on each table using stratified cross‑validation, and out‑of‑fold probability estimates are generated for all training objects to ensure reliable, leakage‑free predictions. These vectors represent the predictive behaviour of each local view. At the global level, probability vectors from all sources are concatenated into a unified representation used to train a global decision tree that integrates information across views and produces the final classification. Experimental evaluation on multiclass benchmark datasets with varying levels of dispersion shows that the proposed method achieves performance comparable to other hierarchical and ensemble approaches designed for distributed data. The comparison included methods that train separate local classifiers and combine their outputs at the decision level. The results demonstrate that the cross‑validation‑based hierarchical strategy makes more effective use of limited and fragmented information while maintaining a strict separation between training and testing data. As a result, the global classifier is trained in a fully leakage‑free manner and remains robust even when individual local tables contain only a small or highly uneven set of features.
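The local-level step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the local model here is a smoothed frequency table standing in for a decision tree, and the function names, toy tables, and labels are all hypothetical. The sketch only shows how out-of-fold probability vectors are produced per local table and concatenated into the representation on which a global classifier would then be trained.

```python
from collections import Counter, defaultdict

def stratified_folds(labels, k):
    """Give each object a fold id so that every class is spread evenly over k folds."""
    seen = Counter()
    folds = []
    for y in labels:
        folds.append(seen[y] % k)
        seen[y] += 1
    return folds

def fit_local(values, labels, classes):
    """Stand-in local model (NOT a decision tree): smoothed class frequencies per value."""
    counts = defaultdict(Counter)
    for v, y in zip(values, labels):
        counts[v][y] += 1

    def predict_proba(v):
        c = counts.get(v, Counter())
        total = sum(c.values()) + len(classes)  # Laplace smoothing
        return [(c[y] + 1) / total for y in classes]

    return predict_proba

def out_of_fold_probabilities(values, labels, classes, k):
    """Each object's vector comes from a model that never saw that object."""
    folds = stratified_folds(labels, k)
    oof = [None] * len(labels)
    for f in range(k):
        train = [i for i in range(len(labels)) if folds[i] != f]
        model = fit_local([values[i] for i in train],
                          [labels[i] for i in train], classes)
        for i in range(len(labels)):
            if folds[i] == f:
                oof[i] = model(values[i])
    return oof

# Hypothetical toy data: two local tables describing the same six objects.
labels = ["a", "a", "a", "b", "b", "b"]
table_1 = ["x", "x", "y", "y", "z", "z"]
table_2 = ["p", "q", "p", "q", "q", "q"]
classes = sorted(set(labels))

local_vectors = [out_of_fold_probabilities(t, labels, classes, k=3)
                 for t in (table_1, table_2)]
# Global representation: concatenate the local probability vectors per object.
meta_features = [sum(parts, []) for parts in zip(*local_vectors)]
```

Because every row of `meta_features` is built only from models fitted on the other folds, the global classifier can be trained on the full training set without setting aside a separate validation subset.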

14:10
Uncertainty Quantification for Multiscale Model of Tritium Breeding Materials

ABSTRACT. In this talk, we present an approach to modelling materials to support the development of a fusion-relevant tritium breeder module, with a primary focus on lithium-based ceramics.

The model approaches the problem via a multiscale method, sequentially coupling scales until the global macroscopic quantities of interest are computed. Here, two atomistic levels are considered to resolve the small-scale processes, employing Density Functional Theory (DFT) and classical Molecular Dynamics (MD) for high-fidelity modelling. The larger scales, which include single-crystal, crystal-amorphous multi-grain and porous systems, are solved via Finite Element Modelling (FEM). Finally, a global model capable of producing quantities of interest is assembled.

The work describes the application of the model to support an experiment jointly undertaken by Bangor University and the University of Birmingham. The experiment comprises the preparation of lithium metatitanate samples, the manufacturing of a tritium-breeding module, and the irradiation of lithium-containing materials under neutron flux at the High Flux Accelerator-Driven Neutron Facility (HF-ADNeF). The goal of the experiment is to demonstrate a proof of concept for tritium breeding capabilities with this approach, as well as to demonstrate the model-based predictability of such tritium breeding devices, with further goals of material and device testing and optimisation involving simulation-experiment interaction.

The specific focus of the work is on Uncertainty Quantification (UQ) and Sensitivity Analysis (SA) of the material model. The UQ methods support various sources of uncertainty, aiming for rigorous validation and realistic prediction capabilities of the code. SA supported both the model and the experimental design, identifying the most important physical factor to be studied via simulation and at the experimental facility.

The UQ approach comprises two main methodological thrusts. The first is the application of the EasyVVUQ library capabilities to a model based on the FESTIM code. The second is the development of a UQ workflow implemented entirely within the MOOSE framework, applying the Stochastic Tools Module (STM) to the TMAP8 code; both are part of the same simulation framework.

14:30
Beyond Black-Box Agents: Explainable and Validatable Generative ABMs

ABSTRACT. Generative agent-based models (GABMs) that embed large language models (LLMs) as autonomous agents have attracted growing interest for simulating human behavior and communication. However, because LLMs operate as opaque black boxes, such models are often difficult to validate, interpret, or replicate, which limits their reliability for theory building in the social sciences. This study presents an exploration of an alternative approach in which LLMs serve as external assistants rather than as agents within simulations. We refer to this as XABM (eXplainable Generative ABM). In this framework, LLMs generate explicit behavioral rules, identify relevant decision variables, and translate theoretical model descriptions into executable simulation prototypes, while human researchers retain full control over model structure and validation. By externalizing the decision logic into transparent, inspectable rules, this approach aims to make computational modeling more interpretable and reproducible. The current implementation is evaluated through a series of preliminary tests, including reproducing the canonical Schelling Segregation Model and preparing an exploratory case study on novel social phenomena. These initial results suggest that using LLMs as rule generators offers a promising direction for transparent and explainable generative agent-based modeling. Together, these applications show how rule-generating uses of LLMs can enhance transparency, reproducibility, and scientific validity in generative agent-based modeling.

14:50
Trustworthy Data Foundations for AI-Driven Analytics in Distributed IoT: A Validation-First Methodology

ABSTRACT. Reliable large language model (LLM)-assisted operations in distributed internet of things (IoT) systems depend fundamentally on the quality, structure, and provenance of telemetry available at inference time. We present a validation-first methodology for AI-driven analytics that enforces end-to-end verification of the data pipeline before permitting LLM inference and retrieval-augmented generation. The approach formalises three pre-inference quality gates addressing infrastructure and schema conformance, data integrity across freshness and continuity constraints, and context readiness through deterministic, ID-bound prompt construction. This methodology is implemented in HOMEPOT (Homogeneous Cyber Management of End-Points and Operational Technology), a unified endpoint and IoT management platform supporting heterogeneous devices and MQTT-connected sensors. We conducted a 10-day Technology Readiness Level (TRL-4) pilot involving 10 devices, processed over 140,000 telemetry samples and health checks, and performed substantial log and state-transition activity. Integrity indicators demonstrate 100% non-null completeness for key telemetry fields and sub-minute maximum inter-arrival gaps, confirming strong temporal continuity. These results define measurable readiness conditions under which AI inference is permitted rather than assumed, providing a reproducible blueprint for dependable AI integration in distributed IoT environments transitioning toward pilot-scale deployment.
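The two integrity indicators reported above, non-null completeness and maximum inter-arrival gap, can be sketched as a simple pre-inference gate. This is an illustrative sketch only, not HOMEPOT code: the field names (`ts`, `temperature`), the 60-second threshold, and the function names are assumptions introduced for the example.

```python
def completeness(samples, field):
    """Fraction of samples in which `field` is present and not None."""
    return sum(s.get(field) is not None for s in samples) / len(samples)

def max_gap_seconds(samples):
    """Largest gap between consecutive timestamps, in seconds."""
    times = sorted(s["ts"] for s in samples)
    return max((b - a for a, b in zip(times, times[1:])), default=0.0)

def inference_permitted(samples, required_fields, max_gap=60.0):
    """Gate: allow inference only if the data are complete and continuous."""
    if not samples:
        return False
    complete = all(completeness(samples, f) == 1.0 for f in required_fields)
    return complete and max_gap_seconds(samples) < max_gap

# Hypothetical telemetry: timestamps in seconds, one key field.
good = [{"ts": 0, "temperature": 21.0},
        {"ts": 30, "temperature": 21.4},
        {"ts": 55, "temperature": 21.3}]
gappy = good + [{"ts": 400, "temperature": 22.0}]
sparse = good + [{"ts": 80, "temperature": None}]
```

Here `inference_permitted(good, ["temperature"])` passes, while `gappy` fails the continuity check and `sparse` fails the completeness check, so inference is permitted only when both conditions are measured and met.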

13:50-15:30 Session 15D: MCDM 2
13:50
Optimization of Inhibitory Rules

ABSTRACT. Representing knowledge in the form of rules is one of the most popular methods due to its intuitiveness and transparency. Shorter rules are easier to understand and interpret. There are various types of rules, e.g., decision rules, action rules, probabilistic rules, non-deterministic rules, and many others. These rules differ in their approach to decision-making and the method of their induction.

This paper focuses on the exploration of inhibitory rules, which are defined by the expression “attribute $\neq$ decision” on the right-hand side. In certain cases, such rules can convey more information about datasets than conventional decision rules. It is known that the problem of constructing inhibitory rules with minimum length is NP-hard. Therefore, various approaches are used to obtain approximate rules. In this work, three novel algorithms for inducing inhibitory rules and systems of such rules are studied. In particular, it is shown that, under the assumption $P \neq NP$, the m-greedy algorithm achieves an approximation ratio that is close to the best achievable by any polynomial-time algorithm. From the perspective of knowledge representation, we experimentally analyze the effectiveness of the proposed algorithms, particularly regarding the minimization of rule length.

14:10
Greedy Algorithm for Modeling Approximate Decision Trees for Distributed Decision Tables

ABSTRACT. This paper considers the following problem in multi-agent decision making: given a tuple of $k$-valued decision tables, called a distributed decision table, construct an approximate decision tree of a given accuracy and minimum depth that can be applied simultaneously to each of the tables (a shared decision tree). This problem is $NP$-hard. Therefore, we focus on a greedy algorithm for constructing an approximate shared decision tree. Because the number of nodes in such trees can grow exponentially with the total number of elements in the decision tables, we do not construct the entire decision tree but instead simulate its operation with a given tuple of attribute values. We obtain accuracy bounds for the depth of decision trees constructed by this greedy algorithm. As an example, we study distributed decision tables in which each decision table is associated with the problem of recognizing the color of a point from a finite set of two-color points in the plane.

14:30
Exponentiation Operators for Asymmetric Interval Numbers and Their Properties

ABSTRACT. Asymmetric Interval Numbers (AINs) represent uncertain quantities by an interval and a representative value defined as the expectation with respect to the auxiliary distribution. Unlike classical interval numbers, which specify only admissible ranges, AINs additionally encode the directional character of uncertainty. To support arithmetic operations, each AIN is associated with a canonical piecewise-constant auxiliary distribution consisting of two uniform segments determined by the interval bounds and the representative value. This distribution serves as a computational tool for evaluating operations on uncertain quantities.

The existing AIN arithmetic covers basic algebraic operations but does not support nonlinear transformations in which uncertainty appears in the exponent. This paper extends AIN arithmetic by deriving analytic exponentiation operators for scalar-base exponentiation $k^X$ and exponentiation between two uncertain quantities $X^Y$. For $k^X$, the operator is obtained in closed form by applying the Law of the Unconscious Statistician (LOTUS) to the auxiliary distribution. For $X^Y$, the representative value is defined by a two-step LOTUS construction that evaluates the joint expectation under the product of auxiliary densities using numerical quadrature. Numerical experiments confirm consistency of both operators with Monte Carlo integration of the auxiliary densities. The proposed extension enables direct application of AINs in nonlinear decision and predictive models involving exponential-type relationships.
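The LOTUS step for $k^X$ can be illustrated with a short sketch. It assumes that the two uniform segments meet at the representative value $c$ of an AIN with bounds $a < c < b$; normalization and the requirement that the mean equal $c$ then fix the segment heights uniquely. The function names are hypothetical, only the expected value of $k^X$ is computed, and the paper's full operator is not reproduced.

```python
import math

def segment_heights(a, c, b):
    """Heights of the two uniform segments on [a, c] and [c, b].

    Assumption: the segments meet at c; the heights follow from
    requiring total mass 1 and mean c.
    """
    alpha = (b - c) / ((b - a) * (c - a))
    beta = (c - a) / ((b - a) * (b - c))
    return alpha, beta

def expected_pow(k, a, c, b):
    """Closed-form E[k**X] via LOTUS, valid for k > 0 and k != 1."""
    alpha, beta = segment_heights(a, c, b)
    return (alpha * (k**c - k**a) + beta * (k**b - k**c)) / math.log(k)

def expected_pow_quadrature(k, a, c, b, n=20000):
    """Midpoint-rule check of the same integral, segment by segment."""
    alpha, beta = segment_heights(a, c, b)
    total = 0.0
    for lo, hi, height in ((a, c, alpha), (c, b, beta)):
        h = (hi - lo) / n
        total += height * h * sum(k ** (lo + (i + 0.5) * h) for i in range(n))
    return total

closed = expected_pow(2.0, 0.0, 1.0, 3.0)
numeric = expected_pow_quadrature(2.0, 0.0, 1.0, 3.0)
```

The closed form follows from integrating $k^x$ against each constant density, since $\int k^x\,dx = k^x / \ln k$; `closed` and `numeric` should agree to within the quadrature error.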

13:50-15:30 Session 15E: ComPsy 2
13:50
Graph-Theoretical Analysis of the Gut-Microbiome-Brain Axis: Identifying Mediators of Suicidal Ideation

ABSTRACT. Despite emerging evidence linking the gut microbiome to suicidal ideation (SI), their complex interplay is typically analyzed through pairwise correlations, leaving the systemic cascade poorly understood. To overcome this limitation and map the directional, multi-step mechanisms of this system, we formalized the gut-brain-SI axis as a directed causal knowledge graph. By systematically extracting causal links from empirical scientific literature to curate metabolic pathways across the vagus nerve, hypothalamic-pituitary-adrenal axis, and systemic circulation, we built a network comprising 41 nodes and 87 edges. Topological analysis revealed a highly structured but sparse architecture characterized by four distinct functional modules. Within this framework, we identified specific nodes responsible for systemic perturbation. Intestinal permeability demonstrated the maximum reach efficiency, acting as the primary upstream catalyst for dysbiosis. Conversely, neuroinflammation emerged as the dominant integration hub, exhibiting the highest degree centrality and betweenness centrality before propagating signals to psychiatric endpoints. These quantitative findings highlight potential topological mediators that could translate localized physiological alterations into suicidal ideation vulnerability. Ultimately, this static causal graph establishes the structural foundation required for future in silico system dynamics modeling of targeted microbial interventions.
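Two of the topological quantities mentioned above, downstream reach and degree, can be sketched on a toy directed graph. The graph below is hypothetical, with abstract node labels and no biological meaning, and "reach" is taken here simply as the number of nodes reachable along directed edges, which may differ from the reach-efficiency measure used in the study.

```python
from collections import deque

def reachable(graph, start):
    """Set of nodes reachable from `start` along directed edges (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen

def total_degree(graph):
    """In-degree plus out-degree for every node."""
    degree = {node: len(targets) for node, targets in graph.items()}
    for targets in graph.values():
        for t in targets:
            degree[t] = degree.get(t, 0) + 1
    return degree

# Hypothetical toy graph: A is an upstream source, D an integration hub.
toy = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}

reach = {node: len(reachable(toy, node)) for node in toy}
degree = total_degree(toy)
top_source = max(reach, key=reach.get)
top_hub = max(degree, key=degree.get)
```

In this toy case the node with the largest reach (`A`) and the node with the highest degree (`D`) differ, mirroring the distinction the abstract draws between an upstream driver and an integration hub.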

14:10
A Generative Multi-Agent Framework for Modeling Depressive Language Entrainment

ABSTRACT. When people interact, they adjust their language to each other. This process of lexical entrainment fosters social agreeableness, but may also give rise to jargon, hypes, memes, and even political polarization when communities converge on a shared vernacular. Here we study what happens when people with depression interact with each other in online social networks with varying degrees of lexical entrainment. We connect generative AI agents in a social network, each endowed with a personal mental health profile. The agents exchange Tweet-like messages that are shaped by their individual score on a PHQ-9 depression questionnaire, while the exchanges induce lexical entrainment. We find that cognitive distortions, a style of thinking associated with and possibly causative of internalizing disorders, can rapidly diffuse in social networks through the process of lexical entrainment, creating a depressogenic psycho-social environment that may lead to worse mental health outcomes throughout the community. Our results may inform targeted approaches to remove risk factors associated with social media use and mitigate the effects of lexical entrainment in communities with mental health challenges.

14:30
Formal Theory Construction of Gender Dysphoria

ABSTRACT. Gender dysphoria (GD) is understood as an inner sense of misalignment between a person’s gender and their biological sex which must cause clinically significant distress to meet diagnostic criteria. A major difficulty in studying GD is a literature riddled with contradictions or lacking in generally held agreements as well as high levels of heterogeneity in clinical samples. We aimed to respond to a need for a robust theory of GD. We first identified relevant phenomena of GD through a large literature review. Next, we developed a verbal theory of GD based on the robust phenomena. The main proposition was that a person who is misaligned between their biological sex, gender and sexuality relative to strong positive correlations and external pressures will be in a state of gender distress. Following this, we selected an Ising network model as our formal model because we deemed it most faithful to the verbal theory. This is due to its ability to represent (mis)alignment by outputting an informative Hamiltonian value reflective of the degree of alignment in a network. Since more misalignment leads to a higher Hamiltonian value, we conceptualise it as reflective of the degree of GD distress. In the fourth step we assessed the adequacy of the proposed model through multiple means: simulations, data analysis, interviews, and model extensions. Finally, we evaluated the overall worth of the theory on multiple fronts. Our results show that our theory of GD is robust and promising, with a potential for a multitude of real life and practical uses such as within the realm of therapy or predicting the success of transition treatment.
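The role of the Hamiltonian as a measure of (mis)alignment can be illustrated with the standard Ising energy, $H(s) = -\sum_{i<j} J_{ij} s_i s_j - \sum_i h_i s_i$, for states $s_i \in \{-1, +1\}$. The sketch below is generic: the three-node network, the coupling values, and the external-field values are hypothetical and are not taken from the study.

```python
from itertools import combinations

def hamiltonian(states, coupling, field):
    """Standard Ising energy for states in {-1, +1}.

    coupling[i][j] > 0 rewards nodes i and j for sharing a state;
    field[i] > 0 is an external pressure pushing node i toward +1.
    """
    pair_term = sum(coupling[i][j] * states[i] * states[j]
                    for i, j in combinations(range(len(states)), 2))
    field_term = sum(h * s for h, s in zip(field, states))
    return -pair_term - field_term

# Hypothetical three-node network with uniform positive couplings.
coupling = [[0.0, 1.0, 1.0],
            [1.0, 0.0, 1.0],
            [1.0, 1.0, 0.0]]
field = [0.5, 0.5, 0.5]

aligned = hamiltonian([+1, +1, +1], coupling, field)
misaligned = hamiltonian([+1, -1, +1], coupling, field)
```

With positive couplings, flipping one node against the others raises the energy, so `misaligned` exceeds `aligned`, which is the property the abstract relies on when it reads a higher Hamiltonian value as more misalignment.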

13:50-15:30 Session 15F: WTCS
13:50
Enhancing Critical Thinking with Multimodal Generative AI

ABSTRACT. The increasing popularity of Artificial Intelligence (AI) in education has transformed learning methodologies by enabling personalized and interactive experiences. However, over-reliance on generative AI technologies has raised concerns about the decline of critical thinking skills, particularly among elementary school students in mathematics. To address this issue, an AI-based educational system is proposed to enhance critical thinking and cognitive reasoning abilities among sixth-grade students. The system employs storytelling and Socratic questioning to promote engagement and cognitive development. It is built on Large Language Models (LLMs) for text and image generation, incorporating external knowledge to improve accuracy and reduce errors. Through the integration of mathematical story generation and contextual visual representations, the system aims to foster deeper learning and problem-solving skills among young learners.

14:10
Student Research Achievements through Computational HPC Course Sequences

ABSTRACT. We describe an approach to engaging students in ongoing computational research into the Collatz Conjecture by creating a sequence of high performance computing (HPC) courses over six semesters. Simultaneously focusing on the fundamentals of parallel programming and introducing new students to the current progress on writing and optimizing code used in the research, each semester's course remained self-contained. Topics of parallel programming required to understand and further the project, as well as goals based on prior results and analysis performed by students, were covered in each of these semesters. The evolution of the code is described from semester to semester, with advances in code optimization and design producing previously unknown results.

14:30
Reframing Cybersecurity Education through Technical, Organisational, and Regulatory Integration: Google Grant Implementation Case Study at WUT

ABSTRACT. Cybersecurity education in computational science is increasingly shaped by socio-technical threats and EU regulatory obligations. In particular, NIS2, CER and DORA emphasize management accountability, incident reporting, resilience testing and systematic risk management, which cannot be addressed by tool-centred curricula alone. This paper reframes cybersecurity education as an integration of technical, organisational and regulatory competencies and reports on the implementation of the Google Cybersecurity Seminars (GCS) programme at the Warsaw University of Technology (WUT). The programme (2024–2026) is delivered across three faculties and combines a 135-hour curriculum (technical laboratories aligned with ISO/IEC 27001 and common security testing methodologies, organisational governance modules, and legal/regulatory interpretation) with peer-led outreach in which students run workshops for primary and secondary schools. We describe the supporting infrastructure (Cybersecurity Laboratory, Clinic, and an e-learning platform) and present preliminary evaluation data. In the first project year, 271 students enrolled, 137 completed the full educational track and 73 received specialist certificates after additional laboratory and community-teaching activities. A CAWI survey conducted in February 2026 (n=40) indicates substantial self-reported gains: 85% reported a clear/very high increase in knowledge, 97.4% increased confidence in teaching others, and 92.5% reported changes in online behaviour. The case study suggests that regulatory-driven integration can improve both professional readiness and community impact.

14:50
Computational Andragogy: Designing Accountable Practice in Computational Science

ABSTRACT. Computational thinking has become central to computing and data science education, emphasising abstraction, representation, and automation. However, much computational instruction remains implicitly pedagogical, focusing on procedural mastery and externally defined performance metrics. Such assumptions do not always align with adult learning contexts, where learners bring prior experience and are expected to exercise autonomy and judgment. This paper introduces computational andragogy, a framework for designing computational learning environments for adult learners. Drawing on adult learning theory, the framework shifts attention from executing predefined tasks toward producing accountable computational work. Computational andragogy is operationalised through four design principles: situated constraints, experience-integrated engagement, decision-bearing outputs, and explicit consequences. The framework is illustrated through a field-based machine learning assignment in which students collected observational data in a natural history museum to train classification models.

15:10
Infusing Computational Science into Computer Science: A Nifty SIMD Assignment

ABSTRACT. Successfully solving challenging problems arising from real-world applications in computational science and engineering frequently requires a concerted effort from different scientific disciplines. Besides leveraging knowledge of science, technology, and mathematics, it is often crucial to also harness practical skills from computer science. Therefore, it is no surprise that elements from computer science are taught in education for computational scientists. However, it is rare to turn this process around and integrate aspects of computational science in courses for computer scientists. We describe a programming assignment that has been successfully taught in the classroom to provide inspiration for computational science to computer science undergraduates. The assignment, used in a course on performance engineering, consists of programming SIMD instructions to implement automatic differentiation. We describe its tasks, share the corresponding solutions, and discuss student feedback.
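As background for the assignment's topic, forward-mode automatic differentiation can be sketched with dual numbers that carry several derivative components per value. This is a plain Python illustration, not the assignment's solution and not SIMD code; the `Dual` class and the function `f` are hypothetical. The component-wise updates in the arithmetic rules are the kind of operation that would map onto SIMD lanes.

```python
class Dual:
    """A value together with its partial derivatives along several directions."""

    def __init__(self, value, grad):
        self.value = value
        self.grad = tuple(grad)

    def _lift(self, other):
        """Treat plain numbers as constants with zero derivative."""
        if isinstance(other, Dual):
            return other
        return Dual(other, (0.0,) * len(self.grad))

    def __add__(self, other):
        other = self._lift(other)
        return Dual(self.value + other.value,
                    (a + b for a, b in zip(self.grad, other.grad)))

    __radd__ = __add__

    def __mul__(self, other):
        other = self._lift(other)
        # Product rule, applied to every derivative component at once.
        return Dual(self.value * other.value,
                    (self.value * b + other.value * a
                     for a, b in zip(self.grad, other.grad)))

    __rmul__ = __mul__

def f(x, y):
    return x * y + 2 * x

# Seed each input with a unit vector to obtain the full gradient in one pass.
x = Dual(3.0, (1.0, 0.0))
y = Dual(4.0, (0.0, 1.0))
result = f(x, y)
```

After evaluation, `result.value` holds $f(3, 4)$ and `result.grad` holds $(\partial f/\partial x, \partial f/\partial y) = (y + 2, x)$ at that point.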

15:30
SEAVEAtk: Building VVUQ Competency for Exascale Computational Science

ABSTRACT. High-stakes computational simulations increasingly inform consequential scientific and societal decisions, yet practitioners are rarely equipped with the rigorous methods required to validate, verify, and quantify uncertainty (VVUQ) in their results. As simulations scale toward exascale environments, and as artificial intelligence and quantum computing reshape what is computationally possible, the risk of acting on unvalidated outputs grows. Equipping the computational science community with VVUQ competencies is no longer optional; it is foundational.

To address this, we present the Software Environment for Actionable and VVUQ-evaluated Exascale Applications toolkit (SEAVEAtk), an open-source platform that integrates seven interoperable components. These include EasyVVUQ for VVUQ workflows, FabSim3 for automation and tool integration, EasySurrogate for the construction of surrogate models in multiscale simulations, MUSCLE3 for multiscale model coupling, the MOGP Emulator for fitting Gaussian process emulators to simulation outputs, QCG-PilotJob for executing application workflows on high-performance computing (HPC) systems, and RADICAL-Cybertools for supporting computation across HPC and distributed computing infrastructures.

SEAVEAtk has already been introduced to students, researchers, and industry practitioners through in-person and online workshops, hackathons, and courses, demonstrating impact across multiple scientific domains and providing a foundation on which a more structured curriculum can be built. Drawing on these components as a pedagogical basis, we propose a progressive curriculum in which learners advance from introductory sensitivity analysis through surrogate modelling to full HPC-scale deployment, always within realistic research workflows rather than contrived exercises. Open-access, domain-specific materials would be designed for independent adoption, with the explicit aim of removing the barrier of ongoing academic dependency.

We set out our vision and welcome perspectives from the workshop community about the design principles that should guide its development, including which VVUQ competencies to prioritise, how to sequence learning across career stages, and how toolkit-anchored approaches can be adapted across disciplines.

17:00-17:30Coffee Break