Abstract: The electricity demand of an exascale High-Performance Computing (HPC) supercomputer rivals that of tens of thousands of people. Given this substantial energy footprint and the increasing strain that climate-related events place on power grids, electricity providers are starting to impose power caps on their users during critical periods. In this context, it becomes crucial to implement strategies that manage the power consumption of supercomputers while ensuring their uninterrupted operation.

This paper investigates the proposition that HPC users can willingly sacrifice some processing performance to contribute to a global energy-saving initiative. To offer an efficient energy-saving strategy that involves users, we introduce a user-assisted supercomputer power-capping methodology. In this approach, users can voluntarily permit their applications to run in a power-capped mode, denoted 'Eco-Mode', when necessary. Leveraging HPC simulations, along with energy traces and application metadata derived from a recent Top500 HPC supercomputer, we conducted an experimental campaign to quantify the effects of Eco-Mode on energy conservation and on user experience. Specifically, our study aimed to demonstrate that, with a sufficient number of users choosing Eco-Mode, the supercomputer maintains good performance within the specified power cap. Furthermore, we sought to determine the optimal conditions regarding the number of users embracing Eco-Mode and the magnitude of power capping required for applications (i.e., the intensity of Eco-Mode). Our findings indicate that slowing jobs down can significantly reduce the number of jobs that must be killed. Moreover, as more users adopt Eco-Mode, the likelihood of any given job being killed also decreases.

Run your HPC jobs in Eco-Mode: revealing the potential of user-assisted power capping in supercomputing systems
