ARCS 2024: 37TH GI/ITG INTERNATIONAL CONFERENCE ON ARCHITECTURE OF COMPUTING SYSTEMS 2024
PROGRAM FOR TUESDAY, MAY 14TH

13:15-14:15 Session 3: Keynote by Lutz Stobbe: "Strategies towards Green HPC – Environmental Analysis and Applied Ecodesign"

Abstract:

After more than a decade of relatively slow growth, the total energy consumption and carbon footprint of data centers will increase substantially in the coming years. This trend is indicated by a new study modelling the carbon footprint of ICT in Germany within the GreenICT@FMD framework project. The keynote has two objectives. The first is to provide insight into the methodological approach to the lifecycle environmental assessment of computer systems. In this context, the factors that lead to increasing environmental impacts in the production and use of enterprise computers are explained in detail, including the environmental impact of semiconductor production in the context of Moore's Law and the use of renewable energies. The second objective addresses the immediate factors currently contributing to a rapid increase in computer power consumption in data centers. Using the current technical development of high-end CPUs as an example, the decline in server-related energy efficiency is analyzed. Further topics include trends in chip cooling and waste heat utilization in data centers. Overall, the keynote provides a holistic perspective on the environmental aspects of high-performance computer systems.
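As a concrete illustration of the lifecycle perspective, a common first-order model splits a server's carbon footprint into an embodied share from production and a use-phase share from electricity consumption. The following Python sketch is purely illustrative; every constant in it is an assumed placeholder, not a figure from the keynote or the GreenICT@FMD study.

    # Illustrative first-order lifecycle carbon model for one server.
    # All constants are hypothetical placeholders, not data from the keynote.
    EMBODIED_KG_CO2E = 1500.0   # assumed production footprint (kg CO2e)
    AVG_POWER_W = 400.0         # assumed average power draw (W)
    LIFETIME_YEARS = 5
    HOURS_PER_YEAR = 8760
    GRID_KG_PER_KWH = 0.35      # assumed grid carbon intensity (kg CO2e/kWh)
    PUE = 1.4                   # assumed power usage effectiveness

    use_kwh = AVG_POWER_W / 1000 * HOURS_PER_YEAR * LIFETIME_YEARS * PUE
    use_kg = use_kwh * GRID_KG_PER_KWH
    total = EMBODIED_KG_CO2E + use_kg
    print(f"use phase: {use_kg:.0f} kg CO2e "
          f"({use_kg / total:.0%} of {total:.0f} kg CO2e lifecycle total)")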

Bio:

Lutz Stobbe is a senior scientist at the Fraunhofer Institute for Reliability and Microintegration (IZM) with 25 years of experience in green information and communication technology (ICT). The research of his group, "Sustainable Networks and Computing", focuses on methodical issues of lifecycle assessment (LCA) and applied eco-design for data center and telecommunication equipment. He developed the 5C methodology, which supports structured modelling of complex lifecycle inventories. As a project manager, he has been involved in dozens of national and international research projects. Most notable are six preparatory studies developing measures for the implementation of the EU Ecodesign Directive, including the ENTR Lot 9 study on enterprise servers and data storage equipment. He has also led expert teams (Begleitforschung) accompanying large publicly funded research programs such as IT2Green, 5G Industrial Internet, Green HPC, and Green ICT. His work for industry includes training and consulting focused on applied LCA and eco-design.

Location: Room 3.06.H01
14:15-14:45 Coffee Break
14:45-16:00 Session 4: Progress in Neural Networks
Location: Room 3.06.H01
14:45
nAIxt: A Light-Weight Processor Architecture for Efficient Computation of Neuron Models

ABSTRACT. The simulation of biological neural networks holds immense promise for advancing both neuroscience and artificial intelligence. Due to its high complexity, it requires powerful computers. However, the high proportion of communication and routing makes general-purpose processing architectures, as used in supercomputers, inefficient. Dedicated hardware, such as ASICs, on the other hand, can be specifically adapted to this type of workload. However, integrated circuits are rigid, precluding the use of future neuron models. To address this contradiction, this paper presents a programmable architecture for the computation of neuron models. Thanks to its Turing completeness, it enables embedding biological neural network simulators into integrated circuits while simultaneously allowing adaptation of the neuron model. To assess suitability, both dedicated circuits and off-the-shelf processors are examined regarding AT efficiency. The proposed versatile architecture turns out to be up to 1800x more area efficient than a RISC-V processor, thereby playing a vital role in accelerating neuroscience simulation and research in AI.
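The abstract does not name a specific neuron model; as a point of reference, the sketch below shows the leaky integrate-and-fire (LIF) update, one of the simplest models such a programmable core would have to evaluate per neuron and per time step. This is plain NumPy for illustration, not the nAIxt instruction set, and all constants are assumed textbook-style values.

    import numpy as np

    def lif_step(v, i_syn, dt=1e-3, tau=20e-3, v_rest=-65e-3,
                 v_thresh=-50e-3, v_reset=-70e-3, r_m=1e7):
        # One Euler step of a leaky integrate-and-fire population.
        # v: membrane potentials (V), shape (n,); i_syn: input currents (A).
        dv = (-(v - v_rest) + r_m * i_syn) * (dt / tau)
        v = v + dv
        spikes = v >= v_thresh          # boolean mask of neurons that fired
        v = np.where(spikes, v_reset, v)  # reset fired neurons
        return v, spikes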

15:10
An Approach Towards Distributed DNN Training on FPGA Clusters

ABSTRACT. We present NADA, a Network Attached Deep learning Accelerator. It provides a flexible hardware/software framework for training deep neural networks on Ethernet-based FPGA clusters. The NADA hardware framework instantiates a dedicated entity for each layer; features and gradients flow through these layers in a tightly pipelined manner. From a compact description of a model and target cluster, the NADA software framework generates a specific configuration bitstream for each FPGA in the cluster. We demonstrate the scalability and flexibility of our approach by mapping an example CNN onto clusters of three to nine Intel Arria 10 FPGAs. To verify NADA's effectiveness for commonly used networks, we train MobileNetV2 on a six-node cluster. We address the inherent incompatibility of the tightly pipelined layer-parallel approach with batch normalization by using online normalization instead.
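The closing point follows from the pipeline structure: a tightly pipelined layer-parallel design never holds a full batch in one place, so per-batch statistics are unavailable. The sketch below illustrates only the core idea of online normalization, using exponential running estimates updated as samples stream through a stage; the decay constant and the omission of the published backward-pass corrections are simplifying assumptions, not NADA's actual formulation.

    import numpy as np

    class OnlineNorm:
        # Simplified forward-only online normalization. Unlike batch norm,
        # statistics are exponential running estimates, so each sample can
        # be normalized as it streams through a pipeline stage without
        # waiting for a full batch.
        def __init__(self, num_features, decay=0.99, eps=1e-5):
            self.mu = np.zeros(num_features)
            self.var = np.ones(num_features)
            self.decay, self.eps = decay, eps

        def __call__(self, x):  # x: (num_features,)
            y = (x - self.mu) / np.sqrt(self.var + self.eps)
            # update running statistics after normalizing
            self.mu = self.decay * self.mu + (1 - self.decay) * x
            self.var = self.decay * self.var + (1 - self.decay) * (x - self.mu) ** 2
            return y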

15:35
The Power of Training: How Different Neural Network Setups Influence the Energy Demand

ABSTRACT. This work offers a heuristic evaluation of how variations in machine learning training regimes and learning paradigms affect the energy consumption of computing hardware, especially HPC hardware, from a life-cycle-aware perspective. While increasing data availability and innovation in high-performance hardware fuel the training of sophisticated models, they also push energy consumption and carbon emissions out of view. The goal of this work is therefore to raise awareness of the energy impact of general training parameters and processes, from learning rate and batch size to knowledge transfer. Multiple setups with different hyperparameter configurations are evaluated on three different hardware systems. Among many results, we found that, even with the same model and hardware reaching the same accuracy, improperly set training hyperparameters consume up to 5 times the energy of the optimal setup. We also extensively examine the energy-saving benefits of learning paradigms, including recycling knowledge through pretraining and sharing knowledge through multitask training.
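The abstract does not detail the measurement setup; one common software-side approach on Linux/Intel systems reads the RAPL energy counters around a training run, roughly as sketched below. This rough sketch covers only the CPU package counter and deliberately ignores GPU power, DRAM domains, and counter wraparound, all of which a careful cross-setup comparison would need.

    import time
    from pathlib import Path

    RAPL = Path("/sys/class/powercap/intel-rapl:0/energy_uj")  # package 0

    def read_uj():
        return int(RAPL.read_text())

    def measure(train_fn):
        # Integrate CPU package energy over a training run via Linux RAPL.
        e0, t0 = read_uj(), time.monotonic()
        train_fn()
        e1, t1 = read_uj(), time.monotonic()
        joules = (e1 - e0) / 1e6
        print(f"{joules:.1f} J over {t1 - t0:.1f} s "
              f"(avg {joules / (t1 - t0):.1f} W)")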

16:00-16:30 Coffee Break
16:30-17:45 Session 5: Organic Computing I
Location: Room 3.06.H01
16:30
An Efficient Multi Quantile Regression Network with Ad Hoc Prevention of Quantile Crossing

ABSTRACT. This article presents the Sorting Composite Quantile Regression Neural Network (SCQRNN), an advanced quantile regression model designed to prevent quantile crossing and enhance computational efficiency. By integrating ad hoc sorting into training, the SCQRNN ensures non-intersecting quantiles, boosting model reliability and interpretability. We demonstrate that the SCQRNN not only prevents quantile crossing and reduces computational complexity but also achieves faster convergence than traditional models. This advancement meets the requirements of high-performance computing for sustainable, accurate computation. In organic computing, the SCQRNN enhances self-aware systems with predictive uncertainties, enriching applications across finance, meteorology, climate science, and engineering.
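The sorting idea named in the abstract can be illustrated compactly: if the network's quantile outputs are sorted in ascending order before the loss is applied, the predicted quantiles cannot cross by construction, and sorting still lets gradients flow. The sketch below pairs that with the standard pinball (quantile) loss; the placeholder network and shapes are assumptions, not the SCQRNN architecture.

    import torch

    def pinball_loss(pred, y, taus):
        # Standard quantile (pinball) loss, averaged over quantiles.
        # pred: (batch, n_quantiles); y: (batch, 1); taus: (n_quantiles,)
        diff = y - pred
        return torch.maximum(taus * diff, (taus - 1) * diff).mean()

    def forward_sorted(model, x):
        # Sort outputs ascending so q_0.1 <= q_0.5 <= q_0.9 by construction;
        # sorting is a permutation, so gradients flow through it.
        return torch.sort(model(x), dim=-1).values

    # illustrative usage with a placeholder two-layer network
    taus = torch.tensor([0.1, 0.5, 0.9])
    model = torch.nn.Sequential(torch.nn.Linear(8, 32), torch.nn.ReLU(),
                                torch.nn.Linear(32, len(taus)))
    x, y = torch.randn(64, 8), torch.randn(64, 1)
    loss = pinball_loss(forward_sorted(model, x), y, taus)
    loss.backward()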

16:55
Modifiable Artificial DNA - Change your System’s ADNA at any Time

ABSTRACT. The Artificial DNA (ADNA) and the Artificial Hormone System (AHS) together constitute a middleware that leverages Organic Computing techniques to enhance the robustness and adaptability of distributed embedded systems, giving them the properties of self-organization, self-healing, self-configuration, and self-improvement. However, the adaptability of such systems is limited by the rigidity of the ADNA, which cannot be modified at runtime. Recent research already assumes such a modifiable ADNA in its applications without implementing it or evaluating its behavior. In this paper, a runtime-modifiable ADNA is presented; its behavior is evaluated experimentally and compared with that of the conventional ADNA.

17:20
From Structured to Unstructured: A Comparative Analysis of CV and Graph Models in solving Mesh-based PDEs

ABSTRACT. This article investigates the application of computer vision and graph-based models in solving mesh-based partial differential equations within high-performance computing environments. Focusing on structured, graded structured, and unstructured meshes, the study compares the performance and computational efficiency of three computer vision-based models against three graph-based models across three datasets. The research aims to identify the most suitable models for different mesh topographies, particularly highlighting the exploration of graded meshes, a less studied area. Results demonstrate that computer vision-based models, notably U-Net, outperform the graph models in prediction performance and efficiency in two (structured and graded) out of three mesh topographies. The study also reveals the unexpected effectiveness of computer vision-based models in handling unstructured meshes, suggesting a potential shift in methodological approaches for data-driven partial differential equation learning. The article underscores deep learning as a viable and potentially sustainable way to enhance traditional high-performance computing methods, advocating for informed model selection based on the topography of the mesh.
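The practical difference between the two model families lies in how a mesh is presented to the network: a structured mesh maps directly onto an image-like tensor for a CNN such as U-Net, while an unstructured mesh is naturally a graph of nodes and edges. A minimal sketch of the two encodings follows; all shapes and feature choices are assumptions, not the datasets from the paper.

    import numpy as np

    # Structured mesh -> image-like tensor for a CNN (e.g., U-Net):
    # an H x W grid of nodes with C field values per node becomes (C, H, W).
    H, W, C = 64, 64, 2
    grid_fields = np.random.rand(H, W, C)        # e.g., solution + source term
    cnn_input = grid_fields.transpose(2, 0, 1)   # channel-first (C, H, W)

    # Unstructured mesh -> graph for a GNN:
    # node features plus an edge list derived from mesh connectivity.
    num_nodes = 500
    node_feats = np.random.rand(num_nodes, C)    # per-node field values
    # each triangle (i, j, k) contributes edges i-j, j-k, k-i (both ways)
    triangles = np.random.randint(0, num_nodes, size=(900, 3))
    edges = np.concatenate([triangles[:, [0, 1]], triangles[:, [1, 2]],
                            triangles[:, [2, 0]]])
    edge_index = np.concatenate([edges, edges[:, ::-1]]).T  # (2, num_edges)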