EE HPC SOP 2021: Energy Efficient HPC State of the Practice Workshop 2021
Cluster 2021, Portland, OR, United States, September 7-10, 2021
Conference website | https://sites.google.com/view/ee-hpc-sop-2021
Submission link | https://easychair.org/conferences/?conf=eehpcsop2021
Abstract registration deadline | June 25, 2021
Submission deadline | June 25, 2021
Author Notification | July 23, 2021
Camera-ready Submission | July 30, 2021
Energy Efficient HPC State of the Practice Workshop (EE HPC SOP 2021)
https://sites.google.com/view/ee-hpc-sop-2021/home
September 7th, 2021 in conjunction with Cluster2021.
Accepted papers will be published by IEEE as part of the Cluster2021 Proceedings.
- Paper Submissions Open: April 30th, 2021
- Paper Submission Deadline: June 25th, 2021 (extended deadline)
- Author Notification: July 23rd, 2021
- Camera Ready Paper: July 30th, 2021
The submission web page for EE HPC SOP 2021 is https://easychair.org/conferences/?conf=eehpcsop2021
Accepted papers must be represented by at least one author and presented at the workshop. The Cluster2021 Conference will be held as a virtual event, so presentation does not require travel.
ABSTRACT: The facility demands for supercomputing centers (SCs) are characterized by electrical power demands for computing systems that scale to tens of megawatts (MW) and millisecond-scale power fluctuations approaching 30 MW for the largest systems. The demand for primary electrical distribution capacity at current large-scale facilities can exceed 60 MW, delivered over multiple, redundant, and diverse medium-voltage feeders. Despite significant pressure on both Moore’s Law and Dennard scaling, the appetite for ever-larger systems and the subsequent demand for both agile power and effective cooling for these systems continue to grow. Computing trends, including highly optimized hardware platforms that may leverage accelerators or other non-traditional components, scalable and high-performing applications, and the requirement to manage exponentially larger data sets, are driving facility demands not envisioned just a few years ago.
SC facilities must consider multiple elements, including the cost to extend or fit existing primary distribution capabilities; the cost and consequence of both trapped and stranded capacity; ever-increasing heat densities for new systems that may render existing cooling mechanisms obsolete or ineffective; increased mandatory use of liquid cooling for portions of the heat load; and wet weights that exceed the carrying capacities of existing raised floor systems. Additionally, the operational costs of these facilities must be balanced against the demand from system owners and users for high availability, high utilization, and low-impact facility maintenance and service. To achieve this balance, many SCs continue to innovate in their operational design practices and technologies. Solutions seek to improve management of both the electrical and mechanical systems and to minimize long-term facility costs through best practices associated with their design. Some SCs are early adopters and innovators in operational practices and technologies geared towards improving energy and power management capabilities. This workshop will explore the operational and technological innovations that span HPC computational systems as well as buildings and building infrastructure.
The purpose of this workshop is to allow for the publication of practices, policies, procedures, and technologies in formal peer-reviewed papers so the broader community can benefit from these experiences. It will expose use cases, lessons learned, and best practices in design, commissioning, and operations. The nature of these papers is generally descriptive, with hard experiential data typically gathered through surveys, case studies, and research for practice.
=========================================
## Workshop Topics of interest include (but are not limited to):
=========================================
- Efficiency and operational insights gained from working with emergency remote and/or limited on-site operations (e.g., due to COVID)
- Electrical power distribution
  - large HPC power loads and rapid power swings
  - electricity service provider relationships with HPC facility
  - facility system design and commissioning
- Power and energy measurement, monitoring and control
  - operational data collection, aggregation and analytics
  - energy and power-aware job scheduling and resource management
  - cooling control systems
  - standards and open interfaces (e.g., Power API, Redfish, GEOPM, READEX, PowerStack)
- Power and energy procurement considerations
  - system requirements (e.g., HPC equipment, software, mechanical systems, facilities)
  - operational costs in procurement
- Liquid cooling
  - standards and open interfaces (e.g., OCP, ASHRAE)
  - facility system design and commissioning
- HPC facility preventative maintenance and management practices for RAS-M (reliability, availability, serviceability, and maintainability)
=========================================
Any paper must:
- not exceed 10 pages, including references. Any paper may be shorter than 10 pages.
- be in PDF format.
- be single-spaced, 2-column numbered pages in IEEE Xplore format (8.5x11-inch paper, margins in inches – top: 0.75, bottom: 1.0, sides: 0.625, and between columns: 0.25, main text: 10pt).
- include author names and affiliations.
- include appropriate citations of prior work.
The review committee will have approximately 35 members, selected primarily for their expertise and willingness to review at least one paper. Each paper will have at least 3 reviewers. In the past two years, accepted papers had an average of 5 reviewers and non-accepted papers had an average of 8 reviewers. Reviewers are required to submit conflict-of-interest information.
Submissions will be judged on correctness, novel or innovative approaches to a problem, technical and/or operational strength, written quality, and interest and relevance to the workshop scope. The workshop organizers will provide written reviews for all timely submissions. Editorial review and recommendations may be provided as well.
=========================================
ORGANIZING COMMITTEE:
- Natalie Bates, Energy Efficient HPC Working Group
- Anna Maria Bailey, Lawrence Livermore National Laboratory
- Siddhartha Jana, Intel
- Torsten Wilde, Hewlett Packard Enterprise
ADVISORY COMMITTEE:
- Chris Deprater, Lawrence Livermore National Laboratory
- David Grant, Oak Ridge National Laboratory
- David Martinez, Sandia National Laboratories
- James H. Laros, Sandia National Laboratories
PROGRAM COMMITTEE:
- TBD
=========================================