Tags: Dynamic Resource Management, High-Performance Computing and Software Stacks
Abstract:
Dynamic Resource Management (DRM) facilitates the real-time adjustment of resources allocated to a job during its execution. Recently, DRM has gained interest due to its potential to contribute to sustainable HPC by enhancing energy efficiency and increasing throughput for HPC system providers and users.
Implementing DRM requires comprehensive changes and coordination across the entire HPC software stack. This poster paper explores a layered approach to DRM, highlighting ongoing work on several DRM solutions across the HPC software stack. These solutions include the OAR resource manager, the Dynamic Processes with PSets (DPP) approach, the Dynamic Management of Resources (DMR) framework, and an adaptive parallel-in-time integration method within the SWEET software utilizing LibPFASST.
We showcase how these approaches combine into a cohesive strategy for DRM across the whole HPC software stack, demonstrating our collaborative effort toward an integrated solution.
A Layered Approach for Dynamic Resource Management in HPC