PROGRAM
Monday, August 25th
10:00-10:30 Session 6A: HiPES.1
Chair: 
Raffaele Montella (Università degli Studi di Napoli Parthenope, Italy)
Location: BAR 205
| 10:00 | A framework for flooding early warning leveraging AI, HPC, and computing continuum (abstract) | 
10:10-10:30 Session 7: WSCC.1
Chair: 
Massimo Torquati (University of Pisa, Italy)
Location: BAR I88
| 10:10 | Efficient FPGA-based GAN Accelerator Core for Edge-AI Platforms (abstract) | 
10:30-11:00 Coffee Break
11:00-12:20 Session 8A: WSCC.2
Chair: 
Massimo Torquati (University of Pisa, Italy)
Location: BAR I88
| 11:00 | Simplifying distributed workflows: A portable approach for Cloud and HPC (abstract) | 
| 11:20 | HPC Software as a Service: A Flexible Approach to Data Logistics (abstract) | 
| 11:40 | A Holistic Approach to Complexity Management and Multidimensional Analysis in Computing Continuum (abstract) | 
| 12:00 | Light Weight Scalable DevOps for Cloud Robotics (abstract) | 
11:00-12:30 Session 8B: DynResHPC.1
Chair: 
Sergio Iserte (Barcelona Supercomputing Center, Spain)
Location: BAR 106
| 11:00 | Design Principles of Dynamic Resource Management for High-Performance Parallel Programming Models (abstract) | 
| 11:30 | A Case Study for Resolving Composability Issues Using a Shared CPU Resource Coordinator (abstract) | 
| 12:00 | Experimental Evaluation of Scheduling Strategies for Evolving Workflow-Based Applications (abstract) | 
11:00-12:30 Session 8C: HiPES.2
Chair: 
Raffaele Montella (Università degli Studi di Napoli Parthenope, Italy)
Location: BAR 205
| 11:00 | Thread Monitoring Tool: transparent characterization of threading patterns with eBPF (abstract) | 
| 11:30 | Accelerating SWIRL Workflows: A High-Performance Rust Backend for Distributed Execution (abstract) | 
| 12:00 | Building Parallel Machine Learning Workflows in PyCOMPSs: The Case Study of Tsunami Forecasting (abstract) | 
11:00-11:50 Session 8D: GraphSys.1
Chair: 
Tiziano De Matteis (Vrije Universiteit Amsterdam, Netherlands)
Location: BAR 218
| 11:00 | A Comparative Study of Streaming Graph Processing Systems (abstract) | 
| 11:25 | A Unified Ontology for Scalable Knowledge Graph–Driven Operational Data Analytics in High-Performance Computing Systems (abstract) | 
12:30-14:00 Lunch Break
14:00-15:30 Session 11A: DynResHPC.2
Chair: 
Sergio Iserte (Barcelona Supercomputing Center, Spain)
Location: BAR 106
| 14:00 | Comparative Analysis of Algorithms for Malleability Decision-Making in Applications and File Systems (abstract) | 
| 14:30 | Malleability in LAIK with MPI Dynamic Processes and PSets (abstract) | 
| 15:00 | Dynamic Data Redistribution for Malleable MPI Frameworks through Virtual Topologies (abstract) PRESENTER: Ahmad Tarraf | 
| 15:15 | Dynamic reconfiguration for malleable applications using RMA (abstract) | 
14:00-14:30 Session 11B: HiPES.3
Chair: 
Raffaele Montella (Università degli Studi di Napoli Parthenope, Italy)
Location: BAR 205
| 14:00 | A Computer-aided Framework for Detecting Osteosarcoma in Computed Tomography Scans (abstract) | 
14:30-15:00 Session 12: HiPES Panel: Discussing the vision about the high-performance cloud computing in eScience application
Location: BAR 205
14:40-15:30 Session 13: GraphSys.2
Chair: 
Tiziano De Matteis (Vrije Universiteit Amsterdam, Netherlands)
Location: BAR 218
| 14:40 | Efficient handling of sparse vectors for parallel nonblocking execution in GraphBLAS (abstract) | 
| 15:05 | Millibenchmarking: Using Graph Sampling for Ranking GPU PageRank Implementations (abstract) | 
15:30-16:00 Coffee Break
17:00-17:30 Session 17: GraphSys: Panel
Chair: 
Jože M. Rožanec (Jožef Stefan Institute, Slovenia)
Location: BAR 218
Tuesday, August 26th
09:00-09:50 Session 19B: VHPC Keynote Tutorial: Writing a hypervisor from scratch. Seiya Nuta, Vercel Inc.
Location: BAR 218
09:50-10:30 Session 23: VHPC.1
Chair: 
Michael Alexander (Austrian Academy of Sciences, Austria)
Location: BAR 218
| 09:50 | Enabling RDMA and GPUs in Rootless Kubernetes for Accelerated HPC and AI Applications (abstract) | 
10:00-10:30 Session 24: HeteroPar.1
Chair: 
José Cano (University of Glasgow, UK)
Location: BAR 205
| 10:00 | Open, cross-architecture acceleration of data analytics with SYCL and RISC-V (abstract) | 
10:30-11:00 Coffee Break
11:00-12:30 Session 25A: PECS.1
Chair: 
Romolo Marotta (Università degli Studi Roma Tre, Italy)
Location: BAR 106
| 11:00 | Evaluating Energy Efficiency of Genomics Algorithms on Processing-in-Memory Architectures (abstract) | 
| 11:30 | SYCL for Energy-Efficient Computational Astrophysics: the case of DPEcho (abstract) | 
| 12:00 | Alumet: a modular framework to standardize the measurement of energy consumption (abstract) | 
11:00-12:30 Session 25B: HeteroPar.2
Chair: 
José Cano (University of Glasgow, UK)
Location: BAR 205
| 11:00 | Federated Learning in the Edge-Cloud Continuum: A Task-Based Approach with Colony (abstract) PRESENTER: Alessio Orsino | 
| 11:30 | OpenDwarfs 2025: Modernizing the OpenDwarfs Benchmark Suite for Heterogeneous Computing (abstract) | 
| 12:00 | Portable High-Performance Kernel Generation for a Computational Fluid Dynamics Code with DaCe (abstract) | 
11:00-12:00 Session 25C: VHPC.2
Chair: 
Michael Alexander (Austrian Academy of Sciences, Austria)
Location: BAR 218
| 11:00 | Performance Analysis of Container-in-VM Architectures: A Study on Hypervisor Isolation and Lightweight OS Integration (abstract) | 
| 11:30 | WebAssembly and Unikernels: A Comparative Study for Serverless at the Edge (abstract) | 
12:30-14:00 Lunch Break
14:00-15:30 Session 26A: PECS.2
Chair: 
Romolo Marotta (Università degli Studi Roma Tre, Italy)
Location: BAR 106
| 14:00 | Mixed precision over GPU applied to a Microphysics model (abstract) | 
| 14:30 | Comparative Analysis of Energy Efficiency in Actor-Based Applications in Distributed Environments (abstract) | 
| 15:00 | HPC Benchmark Game: Comparing Programming Languages Regarding Energy-Efficiency for Applications from the HPC Field (abstract) | 
14:00-15:30 Session 26B: HeteroPar.3
Chair: 
José Cano (University of Glasgow, UK)
Location: BAR 205
| 14:00 | Cyclic Data Streaming on GPUs for Short Range Stencils Applied to Molecular Dynamics (abstract) PRESENTER: Martin Rose | 
| 14:30 | A Portable Branch-and-Bound Algorithm for Cross-Architecture Multi-GPU Systems (abstract) | 
| 15:00 | Tracking the Critical Path of Execution for GPU Offloading Applications (abstract) | 
15:30-16:00 Coffee Break
16:00-17:00 Session 27A: PECS.3
Chair: 
Romolo Marotta (Università degli Studi Roma Tre, Italy)
Location: BAR 106
| 16:00 | Analysis of the carbon footprint of HPC (abstract) | 
| 16:30 | Quantifying the Energy Consumption and Carbon Emissions of LLM Inference via Simulations (abstract) | 
16:00-17:30 Session 27B: HeteroPar.4
Chair: 
José Cano (University of Glasgow, UK)
| 16:00 | SIMON: A Simple Monitoring Framework for Heterogeneous Application Observability (abstract) | 
| 16:30 | Exploiting highly heterogenous systems with stencil applications (abstract) | 
| 17:00 | Green Energy Aware Scheduling of Scientific Workflows with Flexible Deadlines (abstract) | 
Wednesday, August 27th
09:00-09:30 Session 30: Opening Session
Chair: 
Wolfgang E. Nagel (TU Dresden, Germany)
Location: BAR SCHÖ
09:30-10:30 Session 31: Keynote 1: Martin Schulz
Chair: 
Wolfgang E. Nagel (TU Dresden, Germany)
Location: BAR SCHÖ
10:30-11:00 Coffee Break
11:00-12:30 Session 32A: Track 2.1: Scheduling, Resource Management, Cloud, Edge Computing, and Workflows
Chair: 
Rosa María Badia (Barcelona Supercomputing Center, Spain)
Location: BAR 205
| 11:00 | ARC-V: Vertical Resource Adaptivity for HPC Workloads in Containerized Environments (abstract) PRESENTER: Jacob Wahlgren | 
| 11:20 | An Autonomy Loop for Dynamic HPC Job Time Limit Adjustment (abstract) PRESENTER: Thomas Jakobsche | 
| 11:40 | Enabling Elasticity in Scientific Workflows for High Performance Computing Systems (abstract) PRESENTER: Rajat Bhattarai | 
| 12:00 | WAPA: A Workload-Agnostic CPI-Based Thread-to-Core Allocation Policy (abstract) PRESENTER: Marta Navarro | 
11:00-12:30 Session 32B: Track 3.1: Neural Network Acceleration and Optimization
Chair: 
Dora Blanco (University of Santiago de Compostela, Spain)
Location: BAR 106
| 11:00 | FDHA: Fusion-Driven Heterogeneous Accelerator for Efficient Diffusion Model Inference (abstract) | 
| 11:20 | CoQMoE: Co-Designed Quantization and Computation Orchestration for Mixture-of-Experts Vision Transformer on FPGA (abstract) | 
| 11:40 | SkipNZ: Non-Zero Value Skipping for Efficient CNN Acceleration (abstract) PRESENTER: Jinhyeok Choi | 
| 12:00 | BATCH-DNN: Adaptive and Dynamic Batching for Multi-DNN Accelerators (abstract) PRESENTER: Piyumal Ranawaka | 
11:00-12:30 Session 32C: Track 6.1: Memory and I/O Systems
Chair: 
Bettina Schnor (University of Potsdam, Germany)
Location: BAR 218
| 11:00 | NetSenseML: Network-Adaptive Compression for Efficient Distributed Machine Learning (abstract) PRESENTER: Yisu Wang | 
| 11:20 | Breaking the I/O Barrier: 1.2 Tb/s Ethernet Packet Processing on a GPU (abstract) | 
| 11:40 | GECKO: A Write-optimized Hybrid Index based on Disaggregated Memory (abstract) | 
| 12:00 | Scalable OpenMP Remote Offloading via Asynchronous MPI and Coroutine-Driven Communication (abstract) PRESENTER: Jhonatan Cléto | 
11:00-12:30 Session 32D: PhD Symposium Poster Pitch Session
Chairs: 
Leonel Sousa (INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Portugal)
Michael Färber (ScaDS.AI & TU Dresden, Germany)
Location: BAR I88
12:30-14:00 Lunch Break
14:00-15:00 Session 33A: Track 1.1: Performance Analysis and Simulation
Chair: 
Olaf Krzikalla (Deutsches Zentrum für Luft- und Raumfahrt (DLR), Germany)
Location: BAR 205
| 14:00 | Making MPI Collective Operations Visible: Understanding Their Utility and Algorithmic Insights (abstract) PRESENTER: Anna-Lena Roth | 
| 14:20 | TSim4CXL: Trace-driven Simulation Framework for CXL-based High-Performance Computing Systems (abstract) PRESENTER: Jaewoo Son | 
| 14:40 | THAPI: Tracing Heterogeneous APIs (abstract) PRESENTER: Brice Videau | 
14:00-15:00 Session 33B: Track 6.2: Learning systems
Chair: 
Salvador Petit (Universitat Politècnica de València, Spain)
Location: BAR 218
| 14:00 | SQ-DeAR: Sparsified and Quantized Gradient Compression for Distributed Training (abstract) PRESENTER: Xinrui Yang | 
| 14:20 | Accelerating Independent Multi-Agent Reinforcement Learning on Multi-GPU Platforms (abstract) PRESENTER: Samuel Wiggins | 
| 14:40 | ScheInfer: Efficient Inference of Large Language Models with Task Scheduling on Moderate GPUs. (abstract) PRESENTER: Wenxiang Lin | 
14:00-15:00 Session 33C: WHPC Special Session: Advances in HPC Computing Applications
Chair: 
Neda Ebrahimi Pour (German Aerospace Center (DLR), Germany)
Location: BAR 106
| 14:00 | Targeted data movement optimizations for emerging heterogeneous supercomputers (abstract) | 
| 14:20 | Efficient Anisotropic Mesh Refinement with Omnitrees ...or How to Get Cat GIFs Into Your Paper (abstract) | 
| 14:40 | From Reactive Debugging to Proactive Detection: AI for Performance-Aware Software Development (abstract) | 
15:00-16:00 Coffee Break, PhD Symposium, and Posters & Demos Session
The PhD Symposium Posters and the Posters & Demos will be on display during this coffee break.
15:00-16:00 Session 34A: Demos&Poster Session during the Coffee Break
Chairs: 
| Optimized Parallel Metaheuristics for Big Data Processing on GPUs with Apache Spark (abstract) | 
| Portable and Scalable FPGA Emulation of a Massive-Parallel Vector Processor (abstract) PRESENTER: Gia Bao Thieu | 
| Modifying the HyperLedger Fabric Blockchain Architecture to increase throughput and decrease transaction rejections (abstract) | 
| Time-related effects in the measurement of energy consumption in evolutionary algorithms (abstract) | 
| ParSolGen (Parallel Solvers Generator) - an automated numerical parallel programs generator for distributed memory parallel computers (abstract) | 
| Towards Digital Twins of HPC Data Centres Modelling Infrastructure and HPC Systems for IT-Zauber (abstract) | 
| Fault-Tolerant Distributed Federated Learning with Adaptive Termination Detection (abstract) | 
| H2O: Holistic Hyper-Parameter Optimization for Large-Scale Deep Neural Network Training (abstract) | 
15:00-16:00 Session 34B: PhD Symposium Poster Session during the Coffee Break
Chairs: 
Leonel Sousa (INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Portugal)
Michael Färber (ScaDS.AI & TU Dresden, Germany)
| Power Scheduling on Multicore Multiprocessor Systems for Maximizing Throughput and Fairness (abstract) | 
| Accelerating Gate Sizing using GPU (abstract) PRESENTER: Yi-Hua Chung | 
| SCOPE: Accelerating ML data pipeline using cloud-based computational storage (abstract) | 
| Advanced Techniques in Polyhedral Model-Based Compilers for Efficient and Cross-Platform Code Generation on Multicore Processors (abstract) | 
| CoreWaterfall: a Virtual-Core-Focused Scheduling and Allocation Algorithm for Oversubscribed Virtual Machines (abstract) | 
| On-the-fly Performance Analysis of Asynchronous Parallel Execution (abstract) | 
| TH-Pulse: A Study on Hardware-Software Co-Designed Framework for LLM Training and Inference on the Tianhe new-generation supercomputer (abstract) | 
| DCG-DDQ: A Directed Cyclic Graph Based Task Computing System (abstract) | 
| A Hybrid DMA-Cache Mechanism to Leverage Memory Bandwidth in Massive-Parallel Processors (abstract) PRESENTER: Gia Bao Thieu | 
| Boosting Performance of Counting Queries in Machine Learning Applications with a ccNUMA-aware Implementation (abstract) | 
| EAGER: Energy-Aware 3D Gaussian Splatting on Embedded Parallel Heterogeneous Systems (abstract) PRESENTER: Oscar Ferraz | 
| AskLLVM: LLVM Code Generation for GPUs for Graph Algorithms (abstract) | 
| Heterogeneous computing, storage and network infrastructures for medical applications (abstract) | 
16:00-17:30 Session 35A: Track 2.2: Scheduling, Resource Management, Cloud, Edge Computing, and Workflows
Chair: 
Domenico Talia (University of Calabria, Italy)
Location: BAR 205
| 16:00 | HAS-GPU: Efficient Hybrid Auto-scaling with Fine-grained GPU Allocation for SLO-aware Serverless Inferences (abstract) PRESENTER: Jianfeng Gu | 
| 16:20 | CGP-Graphless: Towards Efficient Serverless Graph Processing via CPU-GPU Pipelined Collaboration (abstract) PRESENTER: Yiming Sun | 
| 16:40 | Design and Operation of Elastic GPU-pooling on Campus (abstract) | 
| 17:00 | ServerlessRec: Fast Serverless Inference for Embedding-based Recommender Systems with Disaggregated Memory (abstract) | 
16:00-17:30 Session 35B: Track 6.3: Stream, Image and Sequence Processing
Chair: 
Daniel Cordeiro (University of São Paulo, Brazil)
Location: BAR 218
| 16:00 | SProBench: Stream Processing Benchmark for High Performance Computing Infrastructure (abstract) PRESENTER: Apurv Deepak Kulkarni | 
| 16:20 | SWBWA: A Highly Efficient NGS Aligner on the New Sunway Architecture (abstract) PRESENTER: Lifeng Yan | 
| 16:40 | Efficient Pyramidal Analysis of Gigapixel Images on a Decentralized Modest Computer Cluster (abstract) | 
16:00-17:30 Session 35C: WHPC Special Session: Advances in HPC Computing Applications
Chair: 
Neda Ebrahimi Pour (German Aerospace Center (DLR), Germany)
Location: BAR 106
| 16:00 | Performance optimization of GROMACS on modern Hardware (abstract) | 
| 16:20 | FLEXI: Scale-resolving simulations of compressible turbulence on modern HPC systems (abstract) | 
| 16:40 | Exploring Flow Fields at Scale: GPU-Accelerated Scientific Visualization for Exascale CFD (abstract) | 
| 17:00 | In-Situ Techniques for the Efficient Coupling of Complex Plasma Turbulence Simulations: GENE and GENE-X (abstract) | 
Thursday, August 28th
09:00-10:00 Session 36: Keynote 2: Domenico Talia
Chair: 
Christian Lengauer (University of Passau, Germany)
Location: BAR SCHÖ
10:00-10:30 Coffee Break
10:30-12:30 Session 37: Best Paper Session
Chair: 
Thomas Ludwig (DKRZ, Germany)
Location: BAR SCHÖ
| 10:30 | Noise injection for performance bottleneck analysis (abstract) PRESENTER: Aurélien Delval | 
| 10:50 | Approximation Bounds for SLACK on Identical Parallel Machines (abstract) PRESENTER: Anthony Dugois | 
| 11:10 | SimPoint+: More Stable, Accurate and Efficient Program Analysis (abstract) PRESENTER: Ruini Xue | 
| 11:30 | AlphaSparseTensor: Discovering Faster Sparse Matrix Multiplication Algorithms on GPUs for LLM Inference (abstract) PRESENTER: Xuanzheng Wang | 
| 11:50 | Wedge-Parallel Triangle Counting for GPUs (abstract) PRESENTER: Jeffrey Spaan | 
| 12:10 | External GPU Biconnected Components (abstract) PRESENTER: Abhijeet Sahu | 
12:30-14:00 Lunch Break
14:00-15:30 Session 38A: Track 1.2: Compilers, Optimizations, and Scheduling
Chair: 
Lars Schütze (TU Dresden, Germany)
Location: BAR 205
| 14:00 | CoSF: A Co-Optimization Framework for Operator Splitting and Fusion (abstract) PRESENTER: Wei Li | 
| 14:20 | Scalable Code Generation for RTL Simulation of Deep Learning Accelerators with MLIR (abstract) PRESENTER: Yi-Hua Chung | 
| 14:40 | Scheduling Task and Data Parallelism in Array Languages with Work Assisting (abstract) PRESENTER: Ivo Gabe de Wolff | 
| 15:00 | Polymorphic Higher-Order GPU Kernels (abstract) PRESENTER: Andre Rauber Du Bois | 
14:00-15:30 Session 38B: Track 4.1: Scalable AI Optimization and Parallel Training
Chair: 
Alessio Orsino (University of Calabria, Italy)
Location: BAR 106
| 14:00 | Saving Memory via Residual Reduction for DNN Training with Compressed Communication (abstract) PRESENTER: Xinjue Zheng | 
| 14:20 | Interval-Asynchrony: Delimited Intervals of Localised Asynchrony for Fast Parallel SGD (abstract) PRESENTER: Jacob Garby | 
| 14:40 | Robustness of deep learning classification to adversarial input on GPUs: asynchronous parallel accumulation is a source of vulnerability (abstract) | 
| 15:00 | Tutoring LLM into a Better CUDA Optimizer (abstract) PRESENTER: Martin Kruliš | 
14:00-15:30 Session 38C: Track 3.2: Architecture
Chair: 
Paul Kelly (Imperial College, UK)
Location: BAR 218
| 14:00 | ParTEE: A Framework for Secure Parallel Computing of RISC-V TEEs (abstract) PRESENTER: Hao Lan | 
| 14:20 | ARM SVE Unleashed: Performance and Insights Across HPC Applications on Nvidia Grace (abstract) PRESENTER: Ruimin Shi | 
| 14:40 | CSGC: Collaborative File System Garbage Collection with Computational Storage (abstract) PRESENTER: Jin Pu | 
| 15:00 | SONet: Towards Practical Online Neural Network for Enhancing Hard-To-Predict Branches (abstract) PRESENTER: Zhenxuan Xiong | 
15:30-16:00 Coffee Break
16:00-17:30 Session 39A: Track 3.3: Caching and Memory for ML
Chair: 
Fernando Silva (University of Porto, Portugal)
Location: BAR 106
| 16:00 | CacheC: LLM-based GPU Cache Management to Enhance Kernel Concurrency (abstract) | 
| 16:20 | Cocache: An Accurate And Low-overhead Dynamic Caching Method for GNNs (abstract) PRESENTER: Zhaoyang Zeng | 
| 16:40 | DCI: An Efficient Workload-Aware Dual-Cache Allocation GNN Inference Acceleration System (abstract) | 
| 17:00 | ReSpike: A Co-Design Framework for Evaluating SNNs on ReRAM-based Neuromorphic Processors (abstract) PRESENTER: Kazi Asifuzzaman | 
16:00-17:30 Session 39B: Track 2.3: Scheduling, Resource Management, Cloud, Edge Computing, and Workflows
Chair: 
Alvaro Luiz Fazenda (Federal University of Sao Paulo (UNIFESP), Brazil)
Location: BAR 205
| 16:00 | MPLS: Stacking Diverse Layers into One Model for Decentralized Federated Learning (abstract) PRESENTER: Zhiwei Yao | 
| 16:20 | Federated Learning within Global Energy Budget over Heterogeneous Edge Accelerators (abstract) PRESENTER: Roopkatha Banerjee | 
| 16:40 | Auction-based Placement of Functions in the Fog at Scale (abstract) | 
| 17:00 | Bifröst: Peer-to-peer Load-balancing for Function Execution in Agentic AI Systems (abstract) | 
16:00-17:30 Session 39C: Track 4.2: Efficient AI Inference and Model Serving at Scale
Chair: 
Julio Sahuquillo (Universitat Politècnica de València, Spain)
Location: BAR 218
| 16:00 | TopServe: Task-Operator Co-Scheduling for Efficient Multi-DNN Inference Serving on GPUs (abstract) PRESENTER: Ao Chen | 
| 16:20 | EFIM: Efficient Serving of LLMs for Infilling Tasks with Improved KV Cache Reuse (abstract) PRESENTER: Tianyu Guo | 
| 16:40 | 2:4 Pruning on Edge Devices: Performance, Energy Efficiency and Accuracy (abstract) PRESENTER: Nicolás Hernández González | 
| 17:00 | Light-DiT: An Importance-Aware Dynamic Compression Framework for Diffusion Transformers (abstract) PRESENTER: Cheng Gu | 
Friday, August 29th
09:00-10:00 Session 40: Keynote 3: Florina Ciorba
Chair: 
Diana Goehringer (TU Dresden, Germany)
Location: BAR SCHÖ
10:10-10:30 Coffee Break
10:30-12:00 Session 42A: Track 2.4: Scheduling, Resource Management, Cloud, Edge Computing, and Workflows
Chair: 
Carlos Barrios Hernandez (SC3UIS-CAGE, LIG/INRIA-DataMove, CITI/INRIA -Sindy, Colombia)
Location: BAR 205
| 10:30 | DynoInfer: Adaptive Resource Orchestration for LLM Inference on Resource-Constrained PCs (abstract) PRESENTER: Yunling Chen | 
| 10:50 | Container Workload Prediction Using Deep Domain Adaptation in Transfer Learning (abstract) | 
| 11:10 | KarmaPM: Reward-Driven Power Manager (abstract) | 
| 11:30 | A Sparsity Predicting Approach for General Large Language Models via Activation Pattern Clustering (abstract) PRESENTER: Nobel Dhar | 
10:30-12:00 Session 42B: Track 4.3: Distributed systems, Compression, and Federated Applications
Chair: 
Josef Weidendorfer (Technical University of Munich, Germany)
Location: BAR 106
| 10:30 | DiffNO: Neural Operator Learning using Physically Structured Constrained Diffusion Model (abstract) | 
| 10:50 | Scalable Compression of Massive Data Collections on HPC Systems (abstract) PRESENTER: Loris Belcastro | 
| 11:10 | On-Device Federated Learning for Remote Alpine Livestock Monitoring (abstract) PRESENTER: Sabtain Ahmad | 
| 11:30 | IAUG: Accelerating Augmentation with Importance Sampling in Deep Neural Network Training (abstract) PRESENTER: Germaine Nyatsikor | 
10:30-12:00 Session 42C: Track 5.1: Theory and Algorithms
Chair: 
Jože M. Rožanec (Jožef Stefan Institute, Slovenia)
Location: BAR 218
| 10:30 | Cache Management for Mixture-of-Experts LLMs (abstract) PRESENTER: Adrien Obrecht | 
| 10:50 | Near-optimal contraction strategies for the scalar product in the tensor-train format (abstract) PRESENTER: Atte Torri | 
| 11:10 | Supervised Distributed Computing (abstract) PRESENTER: Julian Werthmann | 
| 11:30 | Partial Detectors Versus Replication To Cope With Silent Errors (abstract) PRESENTER: Alix Tremodeux | 
10:30-12:00 Session 42D: Track 6.4: Graph Algorithms and Linear Algebra
Chair: 
Achim Basermann (German Aerospace Center (DLR), Simulation and Software Technology, Germany)
Location: BAR I88
| 10:30 | Uniform Dense Blocking for Efficient Sparse LU Factorization in First-principles Materials Simulation (abstract) PRESENTER: Chao Wang | 
| 10:50 | Efficient Task Graph Scheduling for Parallel QR Factorization in SLSQP (abstract) | 
| 11:10 | ScaleRunner: A Fast MPI-based Random Walk Engine for Multi-CPU Systems (abstract) PRESENTER: Florian Willich | 
12:00-13:30 Lunch Break
13:30-14:30 Session 43A: Track 2.5: Scheduling, Resource Management, Cloud, Edge Computing, and Workflows
Chair: 
Martin Schulz (Technical University Munich, Germany)
Location: BAR 205
| 13:30 | Leveraging Expert Usage to Speed up LLM Inference with Expert Parallelism (abstract) PRESENTER: Olivier Beaumont | 
| 13:50 | Priority-BF: a Task Manager for Priority-Based Scheduling (abstract) PRESENTER: Ana Gainaru | 
| 14:10 | Green Scheduling on the Edge (abstract) PRESENTER: Joachim Cendrier | 
13:30-14:30 Session 43B: Track 5.2: Theory and Algorithms
Chair: 
Lester Kalms (TU Dresden, Germany)
Location: BAR 218
| 13:30 | Byzantine-Tolerant Consensus in GPU-Inspired Shared Memory (abstract) | 
| 13:50 | Partitioning In-Place on Massively Parallel Systems (abstract) | 
13:30-14:30 Session 43C: Track 6.5: GPU and Quantum Systems
Chair: 
Florina M. Ciorba (University of Basel, Switzerland)
Location: BAR 106
| 13:30 | Disaggregated Design for GPU-Based Volumetric Data Structures (abstract) PRESENTER: Massimiliano Meneghin | 
| 13:50 | Quantum Delta Encoding: Optimizing Data Storage on Quantum Computers with Resource Efficiency (abstract) PRESENTER: Jiale Zhang | 
| 14:10 | SimPart: A Simple Yet Effective Replication-aided Partitioning Algorithm for Logic Simulation on GPU (abstract) PRESENTER: Yi-Hua Chung | 
14:30-15:00 Closing Coffee Break