MODA21: Monitoring and Operational Data Analytics Workshop
ISC High Performance, Frankfurt, Germany, July 2, 2021
Conference website: https://moda21.sciencesconf.org
Submission link: https://easychair.org/conferences/?conf=moda21
Abstract registration deadline: March 29, 2021
Submission deadline: April 5, 2021
The race to Exascale poses significant challenges for the collection and analysis of the vast amount of data that future HPC systems will produce, in terms of the increasing complexity of the machines, the scalability and intrusiveness of the adopted monitoring solution, and the interpretability and effective inference driven by the acquired data.
After a very successful first installment last year, we invite contributions to the 2nd ISC-HPC International Workshop on Monitoring and Operational Data Analytics (MODA). The goal is to provide a venue for sharing insight into current trends in MODA, to identify potential gaps, and to offer an outlook on the future of the fields involved (high performance computing, databases, and machine learning) and on possible solutions for upcoming Exascale systems. Contributions matching the scope of the workshop will be related to:
- Currently envisioned solutions and practices for monitoring systems at data centers and HPC sites. Significant focus will be placed on operational data collection mechanisms that i) cover different system levels, from building infrastructure sensor data to CPU-core performance metrics, and ii) target different end-users, from system administrators to application developers and computational scientists.
- Effective strategies for analyzing and interpreting the collected operational data. Such strategies should particularly include (but are not limited to) different visualization approaches and machine learning-based techniques, potentially inferring knowledge of the system behavior and allowing for the realization of a proactive control loop.
This workshop is not targeting new solutions proposed in the context of application performance modeling and/or application performance analysis tools. Novel contributions in the area of compiler analysis, debugging, programming models, and/or sustainability of scientific software are also considered out of the scope of the workshop.
While MODA is becoming common practice at various international HPC sites, each site follows its own insular approach, which is rarely deployed in production environments and is mostly limited to visualizing system and building infrastructure metrics for health checks. In this regard, we observe a gap between the collection of operational data and its meaningful and effective analysis and exploitation, which prevents the closing of the feedback loop between the monitored HPC system, its operation, and its end-users. Under these premises, the goals of the workshop can be summarized as follows:
- Gather and share knowledge and establish a common ground within the international community with respect to best practices in monitoring and operational data analytics.
- Discuss future strategies and alternatives for MODA, potentially improving existing solutions and envisioning a common baseline approach in HPC sites and data centers.
- Establish a debate on the usefulness and applicability of AI techniques on collected operational data for optimizing the operation of production systems (e.g. for practices such as predictive maintenance, runtime optimization, optimal resource allocation and scheduling).
All papers must be original and not simultaneously submitted to another journal or conference. MODA21 welcomes full papers which will ideally address:
- State-of-the-practice methods, tools, and techniques in monitoring at various HPC sites
- Solutions for monitoring and analysis of operational data that work very well on large- to extreme-scale systems with a large number of users
- Solutions that have proven limitations in terms of efficiency of operational data collection in real-time or in terms of the quality of the collected data
- Opportunities and challenges of using machine learning methods for efficient monitoring and analysis of operational data
- Integration of monitoring and analysis practices into production system software (energy and resource management) and runtime systems (scheduling and resource allocation)
- Explicit gaps between operational data collection, processing, effective analysis, and useful exploitation, together with new approaches to closing these gaps in order to improve HPC center planning, operations, and research
- Other monitoring and operational data analysis challenges and approaches (data collection, storage, visualization, integration into system software, adoption)
- Means to identify misuse, intentional or unintentional, of resources, and methods to mitigate the effects of these: taking automatic steps to contain the effects of one application/job/user allocation on others, supporting users to identify causes for the misbehavior of their application, linking to intrusion detection and safe multitenancy.
- Concepts to integrate MODA into the system design at all levels, including dedicated hardware components, middleware features, and tool support that make ‘monitoring by default’ a viable option without sacrificing performance.
- FAIR data practices, including sharing of monitoring workflows and tools across sites while ensuring compliance with GDPR regulations and user access agreements.
Submission Guidelines
We will solicit original contributions in the form of full papers (6-12 pages) which will be peer-reviewed by the program committee members. All accepted papers will be presented during the workshop. Papers should be submitted through the EasyChair online system at https://easychair.org/conferences/?conf=moda21. Paper submissions are required to be formatted like the ISC research papers using LNCS style (see Springer’s website):
- Single-column format,
- Maximum 12 pages (including figures and references),
- Use Springer’s LaTeX document class or Word template (see Springer’s Proceedings Guidelines),
- The workshop chairs reserve the right to reject incorrectly formatted papers,
- Papers cannot have been previously published or simultaneously under review.
The workshop papers will be published together with the ISC 2021 proceedings, including an abstract of the keynote and invited talks, and a short white paper of the panel session.
Abstract Deadline: March 29, 2021 (AoE)
Paper submission deadline: April 12, 2021 (AoE), extended from April 5, 2021 (AoE) [firmest deadline, to align with the early bird registration deadline]
Notification: May 3, 2021.
Camera-ready: one month after the workshop date (held on July 2, 2021)
Committees
Program Committee
- Andrea Bartolini - University of Bologna, Italy
- Dominik Strassel - Fraunhofer ITWM Kaiserslautern, Germany
- Daniele Cesarini - CINECA, Italy
- Ann Gentile - Sandia National Labs, USA
- Thomas Ilsche - Technische Universität Dresden, Germany
- Jacques-Charles Lafoucriere - CEA, France
- Erwin Laure - TU München and Max Planck Computing and Data Facility, Garching, Germany
- Filippo Mantovani - BSC, Spain
- Diana Moise - HPE, Switzerland
- Alessio Netti - LRZ, Germany
- Thomas Roblitz - University of Bergen, Norway
- Melissa Romanus - NERSC LBNL, USA
- Ugo Varetto - Pawsey Supercomputing Centre, Australia
- Keiji Yamamoto - RIKEN, Japan
- Ales Zamuda - University of Maribor, Slovenia
Organizing committee
- Florina Ciorba – University of Basel, Switzerland
- Utz-Uwe Haus - HPE EMEA Research Lab, Switzerland
- Nicolas Lachiche - University of Strasbourg, France
- Daniele Tafani - Fujitsu, Germany
Venue
The conference will be held virtually, as part of the ISC 2021 Digital Conference, on July 2, 2-6 PM CET.
Program (virtual, online)
Contact
All questions about submissions should be emailed to:
- Florina Ciorba (florina.ciorba@unibas.ch) – University of Basel, Switzerland
- Utz-Uwe Haus (utz-uwe.haus@hpe.com) - HPE EMEA Research Lab, Switzerland
- Nicolas Lachiche (nicolas.lachiche@unistra.fr) - University of Strasbourg, France
- Daniele Tafani (daniele.tafani@fujitsu.com) - Fujitsu, Germany