Program



Wednesday, August 28th


09:00-09:30 Session 1

Chairs:

Sameer Shende, Fernando Silva, Javier Garcia Blas and Jesus Carretero

09:30-10:30 Session 2: Keynote: Mateo Valero

European supercomputers: buying versus building

In 2017, Europe created the EuroHPC initiative and its associated legal funding structure, the “EuroHPC JU” Joint Undertaking, with two main objectives. The first objective is to acquire, build and deploy world-class high performance computing (HPC) infrastructure across Europe. The second objective is to conduct research and development to build HPC hardware manufactured in Europe, as well as the applications (software) that would run on future locally developed European supercomputers.

This talk will cover both objectives in detail. On the one hand, Europe has recently committed a substantial amount of money to the first goal. For example, in the June 2024 Top-500 list, 9 of the Top-20 supercomputers are from Europe. We will go deeper and describe the two main components of the heterogeneous MareNostrum 5 supercomputer, listed separately in positions 8 and 22 of the June 2024 Top-500. Installed at our Barcelona site, MareNostrum 5 is a good illustration of the challenges of building a contemporary supercomputer; for example, space requirements dictated that BSC could no longer implement it within our Church. Therefore, MareNostrum 5 had to be installed in a larger space, while the Church will be used to install our first Quantum Computer, thus fulfilling the prophecy made by Dan Brown in his book "Origin".

On the other hand, and as the second part of my talk, I will describe the European approach to designing general-purpose Made-in-Europe processors and accelerators leveraging the RISC-V Open Instruction Set Architecture (ISA). Currently, this approach is embodied in several large-scale European research projects, namely the European Processor Initiative, EUPilot, Eprocessor, as well as some nationally funded projects. I will briefly describe these projects, including the proof-of-concept chips that successfully boot Linux. Finally, I will hint at the future and describe the initiatives that Europe and the BSC are pursuing with the main goal of developing software and hardware for the MareNostrum 6 supercomputer, which should be a reality in 2027-2028.

Chair:

Jesus Carretero

Location: Auditorium

10:30-11:00 Coffee Break

11:00-13:00 Session 3: Best Paper Session

Chair:

Fernando Silva

Location: Auditorium

11:00	Milo Lurati, Stijn Heldens, Alessio Sclocco and Ben van Werkhoven Bringing auto-tuning to HIP: Analysis of tuning impact and difficulty on AMD and Nvidia GPUs (Artifact) (abstract) PRESENTER: Stijn Heldens
11:20	Olivier Beaumont, Rémi Bouzel, Lionel Eyraud-Dubois, Esragul Korkmaz, Laercio Pilla and Alexandre Van Kempen A 1.25(1+ε)-Approximation Algorithm for Scheduling with Rejection Costs Proportional to Processing Times (Artifact) (abstract) PRESENTER: Esragul Korkmaz
11:40	Hamidreza Ramezanikebrya and Matei Ripeanu (re)Assessing PiM Effectiveness for Sequence Alignment (abstract) PRESENTER: Matei Ripeanu
12:00	Thorsten Wittkopp, Philipp Wiesner and Odej Kao LogRCA: Log-based Root Cause Analysis for Distributed Services (abstract) PRESENTER: Thorsten Wittkopp
12:20	Kåre von Geijer and Philippas Tsigas How to Relax Instantly: Elastic Relaxation of Concurrent Data Structures (Artifact) (abstract) PRESENTER: Kåre von Geijer
12:40	Richard Angersbach, Sebastian Kuckuk and Harald Köstler Code Generation for Octree-Based Multigrid Solvers with Fused Higher-Order Interpolation and Communication (abstract) PRESENTER: Richard Angersbach

13:00-14:00 Lunch Break

14:00-15:00 Session 4: Latinamerica EU HPC Cooperation Panel

Chair:

Carlos J. Barrios

Location: Auditorium

15:00-15:30 Coffee Break

15:30-17:30 Session 5A: Architectures and Accelerators (I)

Chair:

Taisuke Boku

Location: -1.A.04

15:30	Marius Meyer, Tobias Kenter, Kenneth O'Brien, Lucian Petrica, Michaela Blott and Christian Plessl Optimizing Communication for Latency Sensitive HPC Applications on up to 48 FPGAs Using ACCL (abstract) PRESENTER: Marius Meyer
15:50	Keegan Sanchez, Alex Gavin, Suren Byna, Kesheng Wu and Xuechen Zhang A High-Performance Collective I/O Framework Leveraging Node-Local Persistent Memory (abstract) PRESENTER: Xuechen Zhang
16:10	Yunkun Liao, Jingya Wu, Wenyan Lu, Xiaowei Li and Guihai Yan Efficient RNIC Cache Side-channel Attack Detection through DPU-driven Architecture (abstract) PRESENTER: Yunkun Liao
16:30	Pedro Rigon, Brenda Schussler, Alexandre Sardinha, Pedro Mario Silva, Fábio Alves de Oliveira, Alexandre Carissimi, Jairo Panetta, Arthur Lorenzon and Philippe Navaux Harnessing Data Movement Strategies to Optimize Performance-Energy Efficiency of Oil & Gas Simulations in HPC (abstract) PRESENTER: Arthur Lorenzon

15:30-17:30 Session 5B: Theory and Algorithms (I)

Chair:

Javier Garcia Blas

Location: -1.A.07

15:30	Andrzej Lingas Boolean Matrix Multiplication for Highly Clustered Data on the Congested Clique (abstract)
15:50	Eunji Lee, Yoonsang Han and Gordon Moon Accelerated Block-Sparsity-Aware Matrix Reordering for Leveraging Tensor Cores in Sparse Matrix-Multivector Multiplication (Artifact) (abstract) PRESENTER: Eunji Lee
16:10	Roy Nissim, Oded Schwartz and Yuval Spiizer Communication Minimizing Toom-Cook Algorithms (abstract) PRESENTER: Yuval Spiizer
16:30	Stef Graillat, Fabienne Jézéquel, Théo Mary, Roméo Molina and Daichi Mukunoki Reduced-Precision and Reduced-Exponent Formats for Adaptive-Precision Sparse Matrix-Vector Product (abstract) PRESENTER: Roméo Molina

15:30-17:30 Session 5C: Multidisciplinary, Domain-Specific and Applied Parallel and Distributed Computing (I)

Chair:

Thomas Ludwig

Location: -1.A.01

15:30	Jiajun Song, Jiajun Luo, Rongwei Lu, Shuzhao Xie, Bin Chen and Zhi Wang A Joint Approach to Local Updating and Gradient Compression for Efficient Asynchronous Federated Learning (abstract) PRESENTER: Jiajun Song
15:50	Cristian Tatu, Javier Conejero, Fernando Vazquez and Rosa M. Badia GPU Cache System for COMPSs: A Task-Based Distributed Computing Framework (abstract) PRESENTER: Cristian Tatu
16:10	Zhuoyao Huang, Nan Zhang, Jingran Shen, Georgios Diamantopoulos, Zhengchang Hua, Nikos Tziritas and Georgios Theodoropoulos Distributed Simulation for Digital Twins of Large-Scale Real-World DiffServ-Based Networks (abstract) PRESENTER: Nan Zhang
16:30	Subhajit Sahu, Kishore Kothapalli, Hemalatha Eedi and Sathya Peri DF* PageRank: Incrementally Expanding Approaches for Updating PageRank on Dynamic Graphs (Artifact) (abstract) PRESENTER: Subhajit Sahu
16:50	Tiago Carneiro, Engin Kayraklioglu, Guillaume Helbecque and Nouredine Melab Investigating Portability in Chapel for Tree-based Optimization on GPU-powered Clusters (abstract) PRESENTER: Tiago Carneiro

15:30-17:30 Session 5D: Data analytics, AI, and Computational Science (I)

Chair:

Domenico Talia

Location: -1.A.06

15:30	Krishna Teja Chitty-Venkata, Sanjif Shanmugavelu, Varuni Katti Sastry, Murali Emani, Venkatram Vishwanath and Sylvia Howland WActiGrad: Structured Pruning for Efficient Finetuning and Inference of Large Language Models on AI Accelerators (abstract) PRESENTER: Murali Emani
15:50	Hewang Nie, Songfeng Lu, Mu Wang, Jue Xiao, Zhi Lu and Zepu Yi VeriChroma: Ownership Verification for Federated Models via RGB Filters (abstract) PRESENTER: Zhi Lu
16:10	Yuxiang Zhang, Xin Liu, Meng Wu, Mingyu Yan, Wei Yan, Xiaochun Ye and Dongrui Fan Disttack: Graph Adversarial Attacks Toward Distributed GNN Training (abstract) PRESENTER: Yuxiang Zhang
16:30	Pranjal Naman and Yogesh Simmhan Optimizing Federated Learning using Remote Embeddings for Graph Neural Networks (abstract) PRESENTER: Pranjal Naman
16:50	Guangyao Zhou, Haocheng Lan, Yuanlun Xie, Wenhong Tian, Jiahong Qian and Teng Su CSIMD: Cross-Search Algorithm with Improved Multi-Dimensional Dichotomy for Micro-batch-based Pipeline Parallel Training in DNN (abstract) PRESENTER: Guangyao Zhou

20:30-22:30 Welcome Reception

Thursday, August 29th


09:00-10:00 Session 6: Keynote: Franck Cappello

AuroraGPT: Rationale, Challenges and Development of an AI Research Assistant

Innovative methods, new instruments, disruptive techniques, and groundbreaking technologies have led to significant leaps in scientific progress. The increasingly powerful Large Language Models (LLMs) released each month have already sped up research activities such as concept explanation, literature search, and summarization. The transformative potential of AI in research activities, in particular, foundation models, raises important questions about their performance in science activities, their potential application in different contexts, and their ethics. In this talk, I will first explore the notion of AI research assistants and then discuss the gap between an ideal AI research assistant and the current LLMs, focusing on HPC and parallel computing research problems. The gap motivates the development of research-oriented LLMs. AuroraGPT is developed as an open foundation model trained specifically with scientific data to explore solutions toward the realization of effective AI research assistants. I will describe the activity, challenges, and progress of the different groups developing the key aspects of AuroraGPT. I will particularly focus on the critical and hard task of LLMs' scientific skills, safety, and trustworthiness evaluation.

Chair:

Christian Lengauer

Location: Auditorium

10:00-10:30 Coffee Break

10:30-12:30 Session 7A: Architectures and Accelerators (II)

Chair:

Jean-Baptiste Besnard

Location: -1.A.04

10:30	Xingbin Wang, Dan Meng and Rui Hou FakeGuard: A Novel Accelerator Architecture for Deepfake Detection Networks (abstract) PRESENTER: Xingbin Wang
10:50	Hongbing Tan, Xiaowei He, Libo Huang, Guichu Sun, Yuanhu Cheng, Jing Zhang, Zhong Zheng, Quan Deng, Bingcai Sui, Yongwen Wang and Liquan Xiao ImSPU: Implicit Sharing of Computation Resources between Vector and Scalar Processing Units (abstract) PRESENTER: Hongbing Tan
11:10	Dengke Han, Meng Wu, Runzhen Xue, Mingyu Yan, Xiaochun Ye and Dongrui Fan ADE-HGNN: Accelerating HGNNs through Attention Disparity Exploitation (abstract) PRESENTER: Dengke Han
11:30	Dario Muñoz-Muñoz, Félix García-Carballeira, Diego Camarmas-Alonso, Alejandro Calderón-Mateos and Jesús Carretero Fault tolerant in the Expand Ad-Hoc parallel file system (Artifact) (abstract) PRESENTER: Dario Muñoz-Muñoz
11:50	Jonas Hahnfeld, Jakob Blomer and Thorsten Kollegger Parallel Writing of Nested Data in Columnar Formats (Artifact) (abstract) PRESENTER: Jonas Hahnfeld

10:30-12:30 Session 7B: Data analytics, AI, and Computational Science (II)

Chair:

Tomas Margalef

Location: -1.A.06

10:30	Yuhang Li, Tong Liu, Wenfeng Shen, Yangguang Cui and Weijia Lu Improving Generalization and Personalization in Long-Tailed Federated Learning via Classifier Retraining (abstract) PRESENTER: Yuhang Li
10:50	Mengde Zhu, Wanyi Ning, Qi Qi, Jingyu Wang, Zirui Zhuang, Haifeng Sun, Jun Huang and Jianxin Liao FLUK: Protecting Federated Learning against Malicious Clients for Internet of Vehicles (abstract) PRESENTER: Mengde Zhu
11:10	Haoran Dang, Meng Wu, Mingyu Yan, Xiaochun Ye and Dongrui Fan GDL-GNN: Applying GPU Dataloading of Large Datasets for Graph Neural Network Inference (abstract) PRESENTER: Haoran Dang
11:30	Weigang Zhang, Biyu Zhou, Xing Wu, Chaochen Gao, Zhibing Liu, Xuehai Tang, Ruixuan Li, Jizhong Han and Songlin Hu Quartet: A Holistic Hybrid Parallel Framework for Training Large Language Models (abstract) PRESENTER: Weigang Zhang
11:50	Héctor Martínez, Francisco D. Igual, Rafael Rodríguez-Sánchez, Sandra Catalan, Adrián Castelló and Enrique S. Quintana-Orti Inference with Transformer Encoders on ARM and RISC-V Multicore Processors (abstract) PRESENTER: Enrique S. Quintana-Orti

10:30-12:30 Session 7C: Multidisciplinary, Domain-Specific and Applied Parallel and Distributed Computing (II)

Chair:

Dora Blanco

Location: Auditorium

10:30	Júnior Löff, Dalvan Griebler, Luiz Gustavo Fernandes and Walter Binder MPR: An MPI Framework for Distributed Self-Adaptive Stream Processing (abstract) PRESENTER: Júnior Löff
10:50	Dian-Lun Lin, Joshua San Miguel, Umit Ogras and Tsung-Wei Huang TaroRTL: Accelerating RTL Simulation using Coroutine-based Heterogeneous Task Graph Scheduling (abstract) PRESENTER: Tsung-Wei Huang
11:10	Thiago Maltempi, Sandro Rigo, Marcio Pereira, Hervé Yviquel, Jessé Costa and Guido Araujo Combining Compression and Prefetching to Improve Checkpointing for Inverse Seismic Problems in GPUs (abstract) PRESENTER: Thiago Maltempi
11:30	Andoni Salcedo Navarro, Raúl Peña Ortiz, José M. Claver, Miguel Garcia Pineda and Juan Gutiérrez-Aguado Cloud-native GPU-enabled architecture for parallel video encoding (abstract) PRESENTER: Andoni Salcedo Navarro
11:50	Xiaokang Fan, Zhen Ge, Sifan Long, Tao Tang, Chun Huang, Lin Peng and Canqun Yang VLASPH: Smoothed Particle Hydrodynamics on VLA SIMD Architectures (abstract) PRESENTER: Xiaokang Fan

10:30-12:30 Session 7D: Scheduling, Resource Management, Cloud, Edge Computing, and Workflows (I)

Chair:

Wolfgang Nagel

Location: -1.A.01

10:30	Louis-Claude Canon, Anthony Dugois and Loris Marchal Solving the Restricted Assignment Problem to Schedule Multi-Get Requests in Key-Value Stores (Artifact) (abstract) PRESENTER: Anthony Dugois
10:50	Sixing Yu, Pablo Munoz and Ali Jannesari Resource-Aware Heterogeneous Federated Learning with Specialized Local Models (abstract) PRESENTER: Sixing Yu
11:10	Vincent Fagnon, Giorgio Lucarelli and Christophe Rapine Makespan Minimization for Scheduling on Heterogeneous Platforms with Precedence Constraints (abstract) PRESENTER: Giorgio Lucarelli
11:30	Zhengda Wu, Yixiao Feng, Mingtai Lv, Sining Yang and Bo Zhang Deadline-driven Enhancements and Response Time Analysis of ROS2 Multi-threaded Executors (abstract) PRESENTER: Zhengda Wu
11:50	Danilo Carastan-Santos, Georges Da Costa, Millian Poquet, Patricia Stolf and Denis Trystram Light-weight prediction for improving energy consumption in HPC platforms (Artifact) (abstract) PRESENTER: Danilo Carastan-Santos

12:30-13:30 Lunch Break

13:30-15:30 Session 8A: Scheduling, Resource Management, Cloud, Edge Computing, and Workflows (II)

Chair:

Jean-Thomas Acquaviva

Location: -1.A.01

13:30	Jiazhi Jiang, Hongbin Zhang, Deyin Liu, Jiangsu Du, Xiaojiao Yao, Jinhui Wei, Pin Chen, Dan Huang and Yutong Lu Efficient Coupling Streaming AI and Ensemble Simulations on HPC Clusters (abstract) PRESENTER: Dan Huang
13:50	Filip Mikina, Paweł Żuk and Krzysztof Rzadca sAirflow: Adopting Serverless in a Legacy Workflow Scheduler (abstract) PRESENTER: Paweł Żuk
14:10	Farah Ait Salaht, Nora Izri and Maher Rebai Optimizing Service Replication and Placement for IoT Applications in Fog Computing Systems (abstract) PRESENTER: Farah Ait Salaht
14:30	Alexis Bandet, Francieli Boito and Guillaume Pallez Scheduling distributed I/O resources in HPC systems (abstract) PRESENTER: Alexis Bandet
14:50	Qian Yang, Xuyan Jiang, Wei Quan, Rulin Liu and Zhigang Sun Node Bundle Scheduling: An Ultra-Low Latency Traffic Scheduling Algorithm for TAS-based Time-Sensitive Networks (abstract) PRESENTER: Qian Yang

13:30-15:30 Session 8B: Architectures and Accelerators (III)

Chair:

Enrique S. Quintana-Orti

Location: -1.A.05

13:30	Steef Hegeman, Daan Wöltgens, Anton Wijs and Alfons Laarman Compact Parallel Hash Tables on the GPU (Artifact) (abstract) PRESENTER: Steef Hegeman
13:50	Gabriel Gomez-Lopez, Miguel Sánchez de la Rosa, Jesus Escudero-Sahuquillo, Pedro Javier Garcia, Francisco J. Quiles and Pierre-Axel Lagadec Hybrid Congestion Control for BXI-based Interconnection Networks (abstract) PRESENTER: Gabriel Gomez-Lopez
14:10	Stepan Nassyr and Dirk Pleiter Exploring processor micro-architectures optimised for BLAS3 micro-kernels (abstract) PRESENTER: Stepan Nassyr
14:30	Xuan Zhang, Zhuoran Song, Fangxin Liu, Zhezhi He, Li Jiang and Xiaoyao Liang Watt: A Write-optimized RRAM-based Accelerator for Attention (abstract) PRESENTER: Xuan Zhang

13:30-15:30 Session 8C: Theory and Algorithms (II)

Chair:

Philippe Navaux

Location: -1.A.06

13:30	Ivo Gabe de Wolff, Daniel Anderson, Gabriele K. Keller and Aleksei Seletskiy A Fast Wait-Free Solution to Read-Reclaim Races in Reference Counting (Artifact) (abstract) PRESENTER: Ivo Gabe de Wolff
13:50	Qasim Abbas, Mohsen Koohi Esfahani, Ian Overton and Hans Vandierendonck QClique: Optimizing Performance and Accuracy in Maximum Weighted Clique (abstract) PRESENTER: Qasim Abbas
14:10	Sharon Boddu and Maleq Khan ALZI: An Improved Parallel Algorithm for Finding Connected Components in Large Graphs (abstract) PRESENTER: Maleq Khan
14:30	Matthieu Robeyns, Marc Baboulin, Simplice Donfack, Oguz Kaya and Theo Mary Mixed precision randomized low-rank approximation with GPU tensor cores (abstract) PRESENTER: Matthieu Robeyns
14:50	Filippo Ziche, Federico Busato, Rosalba Giugno and Nicola Bombieri GPU-Accelerated BFS for Dynamic Networks (abstract) PRESENTER: Filippo Ziche

13:30-15:30 Session 8D: Programming, Compilers and Performance (II)

Chair:

Cristina Silvano

Location: -1.A.04

13:30	Greg Henry, Eric Petit, Alexander Lyashevsky and Peter Caday Deconstructing HPL-MxP benchmark: a numerical perspective (abstract) PRESENTER: Eric Petit
13:50	Le Chen, Arijit Bhattacharjee, Nesreen Ahmed, Niranjan Hasabnis, Gal Oren, Vy Vo and Ali Jannesari OMPGPT: A Generative Pre-trained Transformer Model for OpenMP (abstract) PRESENTER: Arijit Bhattacharjee
14:10	Bizhao Shi, Tuo Dai, Sunan Zou, Xinming Wei and Guojie Luo ImageMap: Enabling Efficient Mapping from Image Processing DSL to CGRA (abstract) PRESENTER: Bizhao Shi
14:30	Lucas Van Lanker, Hugo Taboada, Elisabeth Brunet and François Trahay Predicting GPU kernel's performance on upcoming architectures (abstract) PRESENTER: Lucas Van Lanker
14:50	Bengisu Elis, David Boehme, Olga Pearce and Martin Schulz A Mechanism to Generate Interception Based Tools for HPC Libraries (abstract) PRESENTER: Bengisu Elis

15:30-16:00 Coffee Break

16:00-17:30 Session 9A: Data analytics, AI, and Computational Science (III)

Chair:

Carlos J. Barrios

Location: -1.A.06

16:00	Kohei Hiraga and Osamu Tatebe PEANUTS: A Persistent Memory-Based Network Unilateral Transfer System for Enhanced MPI-IO Data Transfer (Artifact) (abstract) PRESENTER: Kohei Hiraga
16:20	Lin Wang, Yuchong Hu, Yuxue Liu, Renzhi Xiao and Dan Feng Asymmetric Coded Distributed Computation for Resilient Prediction Serving Systems (abstract) PRESENTER: Lin Wang
16:40	Yunkun Liao, Hanyue Lin, Jingya Wu, Wenyan Lu, Huawei Li, Xiaowei Li and Guihai Yan Athena: Add More Intelligence to RMT-based Network Data Plane with Low-bit Quantization (abstract) PRESENTER: Yunkun Liao
17:00	Zhi Lu, Songfeng Lu, Yongquan Cui, Junjun Wu, Hewang Nie, Jue Xiao and Zepu Yi Lightweight Byzantine-Robust and Privacy-Preserving Federated Learning (abstract) PRESENTER: Zhi Lu
17:20	Jiguang Lv, Shuchun Xu, Xiaodong Zhan, Tao Liu, Dapeng Man and Wu Yang FedGG: Leveraging Generative Adversarial Networks and Gradient Smoothing for Privacy Protection in Federated Learning (abstract) PRESENTER: Shuchun Xu

16:00-17:30 Session 9B: Multidisciplinary, Domain-Specific and Applied Parallel and Distributed Computing (III)

Chair:

George Angelos Papadopoulos

Location: -1.A.01

16:00	Yuang Chen and Jeffery Xu Yu Vectorizing Sparse Blocks of Graph Matrices for SpMV (abstract) PRESENTER: Yuang Chen
16:20	L. Felipe Romero, Marcos Lupión Lorente, N. C. Cruz, Luis F. Romero and Pilar M. Ortigosa On the use of hybrid computing for accelerating EEG preprocessing (abstract) PRESENTER: L. Felipe Romero
16:40	Jie Jia, Yi Liu, Yifan Chen, Yanke Liu and Fang Lin AdapCK: Optimizing I/O for Checkpointing on Large-scale High Performance Computing Systems (abstract) PRESENTER: Jie Jia
17:00	Dazheng Liu, Xiaoli Ren, Jianping Wu, Wenjuan Liu, Juan Zhao and Shaoliang Peng Pipe-AGCM: A Fine-grain Pipelining Scheme for Optimizing the Parallel Atmospheric General Circulation Model (abstract) PRESENTER: Dazheng Liu

16:00-17:30 Session 9C: Scheduling, Resource Management, Cloud, Edge Computing, and Workflows (III)

Chair:

Tommaso Cucinotta

Location: -1.A.05

16:00	Handong Luo, Wenhao Liu, Qi Zhang, Ziheng Yang, Quanwei Lin, Wenjun Zhu, Kun Qiu, Zhe Chen and Yue Gao Hurry: Dynamic Collaborative Framework For Low-orbit Mega-Constellation Data Downloading (abstract) PRESENTER: Handong Luo
16:20	Yibing Lin, Binbin Feng and Zhijun Ding Context-aware Runtime Type Prediction for Heterogeneous Microservices (abstract) PRESENTER: Yibing Lin
16:40	Yuandou Wang, Neel Kanwal, Kjersti Engan, Chunming Rong, Paola Grosso and Zhiming Zhao PriCE: Privacy-Preserving and Cost-Effective Scheduling for Parallelizing the Large Medical Image Processing Workflow over Hybrid Clouds (abstract) PRESENTER: Yuandou Wang
17:00	Yiming Yao, Yingwei Luo, Xiaolin Wang, Zhenlin Wang, Liujia Li, Jianyu Wu and Liren Zhu EKRM: Efficient Key-Value Retrieval Method to Reduce Data Lookup Overhead for Redis (abstract) PRESENTER: Yiming Yao

18:45-20:00 Walking Tour

Did you know that Madrid was a relatively small town, practically unknown outside of Spain before the year 1561? The city’s fortunes changed that year when it burst upon the scene of European politics by becoming the permanent capital of Spain. The dynasty at the head of this change was known as the Habsburgs, a family that ruled the country and much of the known world from the 16th to the 18th century, and who were referred to in Spain as the House of Austria.

The historic center of Madrid was built up predominantly during the reign of that same dynasty, and this fascinating walking tour of Madrid de los Austrias takes you through that area, giving you the best introduction to the Spanish capital.

20:30-22:30 Gala Dinner

Friday, August 30th


09:00-10:00 Session 10: Keynote: İlkay Altıntaş

Bridging the Data Gaps to Democratize AI in Science, Education and Society

The democratization of Artificial Intelligence (AI) necessitates an ecosystem where data and research infrastructure are seamlessly integrated and universally accessible. This talk overviews the imperative of bridging the gaps between these components through robust services, facilitating an inclusive AI landscape that empowers diverse research communities and domains. The National Data Platform (NDP) aims to lower the barriers to entry for AI research and applications through an integrated services approach to streamline AI workflows, from data acquisition to model deployment. This approach underscores the importance of open, extensible, and equitable systems in driving forward the capabilities of AI, ultimately contributing to the resolution of grand scientific and societal challenges. Through examining real case studies leveraging open data platforms and scalable research infrastructure, the talk will highlight the role of composable systems and services in NDP to catalyze a platform to empower users from all backgrounds to engage in meaningful research, learning, and discovery.

Chair:

Sameer Shende

Location: Auditorium

10:00-10:30 Coffee Break

10:30-12:30 Session 11A: WHPC Session

Chair:

Marta Garcia

Location: Auditorium

10:30	Rosa Badía Making easier the life-cycle management of complex application workflows (abstract)
11:10	Serena Curzel Pre-Scheduling of Affine Loops for HLS Pipelining (abstract)
11:40	Marta Bertran Ferrer Evaluation of CPU constraining mechanisms in the LHC ALICE experiment Grid (abstract)

10:30-12:30 Session 11B: Industrial Session

Location: -1.A.01

10:30	Helena Vela Supporting HPC Centers: challenges, horror stories and best practices (abstract)
11:00	Sameer Shende ParaTools Pro for E4S (abstract)
11:30	Elisabetta Boella E4 at the forefront of European HPC (abstract)

12:30-13:30 Lunch Break

13:30-14:50 Session 12A: Programming, Compilers and Performance (I)

Chair:

Javier Fernandez Muñoz

Location: -1.A.05

13:30	Suren Harutyunyan Gevorgyan, Anna Sikora, Eduardo Cesar, Jiří Filipovič, Akash Dutta, Ali Jannesari and Jordi Alcaraz Efficient Code Region Characterization through Automatic Performance Counters Reduction using Machine Learning Techniques (abstract) PRESENTER: Suren Harutyunyan Gevorgyan
13:50	Anju Mongandampulath Akathoott and Rupesh Nasre FlexiGran: Flexible Granularity Locking in Hierarchies (abstract) PRESENTER: Anju Mongandampulath Akathoott
14:10	Mohammad Zubair and Christoph Bauinger ESIMD GPU implementations of Deep Learning Sparse Matrix Kernels (abstract) PRESENTER: Christoph Bauinger

13:30-14:50 Session 12B: Multidisciplinary, Domain-Specific and Applied Parallel and Distributed Computing (IV)

Chair:

Manuel Capel

Location: -1.A.01

13:30	Xianlong Zhou, Pei Li, Jiageng Chen and Shixiong Yao Accelerating Stencil Computation with Fully Homomorphic Encryption Using GPU (abstract) PRESENTER: Pei Li
13:50	Helena Schubert da Incarnacao Lima da Silva, Maria Clicia Stelling de Castro, Fabricio Alves Barbosa da Silva and Alba Cristina Magalhaes Alves de Melo A Framework for Automated Parallel Execution of Scientific Multi-Workflow Applications in the Cloud with Work Stealing (abstract) PRESENTER: Helena Schubert da Incarnacao Lima da Silva
14:10	Guofeng Feng, Hongyu Wang, Zhuoqiang Guo, Mingzhen Li, Tong Zhao, Zhou Jin, Weile Jia, Guangming Tan and Ninghui Sun Accelerating Large-Scale Sparse LU Factorization for RF Circuit Simulation (abstract) PRESENTER: Guofeng Feng

13:30-14:50 Session 12C: Scheduling, Resource Management, Cloud, Edge Computing, and Workflows (IV)

Chair:

Dominik Hubert

Location: -1.A.06

13:30	Zechun Zhou, Jingwei Sun, Hengquan Mei, Peng Sun and Guangzhong Sun DProbe: Profiling and Predicting Multi-Tenant Deep Learning Workloads for GPU Resource Scaling (abstract) PRESENTER: Zechun Zhou
13:50	Haibo Tang, Huan Zhang, Zhenyu Zhang, Zhao Zhang, Cheqing Jin and Aoying Zhou Towards High-Performance Transactions via Hierarchical Blockchain Sharding (abstract) PRESENTER: Haibo Tang
14:10	Tingkai Liu, Huili Tao, Yicheng Lu, Zhongbo Zhu, Marquita Ellis, Sara Kokkila-Schumacher and Volodymyr Kindratenko Automated Data Management and Learning-based Scheduling for Ray-based Hybrid HPC-Cloud Systems (abstract) PRESENTER: Tingkai Liu

13:30-14:50 Session 12D: Architectures and Accelerators (IV)

Chair:

Raffaele Montella

Location: -1.A.04

13:30	Mohammad Hafezan, Reza Jahadi and Ehsan Atoofian PCTC: Hardware and Software Co-Design for Pruned Capsule Networks on Tensor Cores (abstract) PRESENTER: Ehsan Atoofian
13:50	Chuhui Wang, Zewen Ye, Haibin Shen and Kejie Huang A Folded Computation-in-Memory Accelerator for Fast Polynomial Multiplication in BIKE (abstract) PRESENTER: Zewen Ye
14:10	Leandro Fiorin and Cristina Silvano MEPAD: A Memory-efficient Parallelized Direct Convolution Algorithm for Deep Neural Networks (abstract) PRESENTER: Leandro Fiorin

14:50-15:00 Session 13: Conference Closing

Chairs:

Jesus Carretero and Javier Garcia Blas

Location: Auditorium