Spend More to Save More (SM2): An Energy-Aware Implementation of Successive Halving for Sustainable Hyperparameter Optimization

Daniel Geissler, Bo Zhou, Sungho Suh, Paul Lukowicz
2024-12-12
Abstract: A fundamental step in the development of machine learning models commonly involves the tuning of hyperparameters, often leading to multiple model training runs to work out the best-performing configuration. As machine learning tasks and models grow in complexity, there is an escalating need for solutions that not only improve performance but also address sustainability concerns. Existing strategies predominantly focus on maximizing the performance of the model without considering energy efficiency. To bridge this gap, in this paper, we introduce Spend More to Save More (SM2), an energy-aware hyperparameter optimization implementation based on the widely adopted successive halving algorithm. Unlike conventional approaches including energy-intensive testing of individual hyperparameter configurations, SM2 employs exploratory pretraining to identify inefficient configurations with minimal energy expenditure. Incorporating hardware characteristics and real-time energy consumption tracking, SM2 identifies an optimal configuration that not only maximizes the performance of the model but also enables energy-efficient training. Experimental validations across various datasets, models, and hardware setups confirm the efficacy of SM2 to prevent the waste of energy during the training of hyperparameter configurations.
Machine Learning
What problem does this paper attempt to address?
The problem that this paper attempts to solve is how to improve model performance and reduce energy consumption at the same time during the hyperparameter optimization of machine learning models. As machine learning tasks and models grow in complexity, existing hyperparameter optimization methods often focus only on maximizing model performance while ignoring energy efficiency. This leads to a great deal of unnecessary energy waste, especially during large-scale training.

To address this challenge, the paper proposes a new method named "Spend More to Save More (SM2)". SM2 is an energy-aware hyperparameter optimization implementation based on the widely adopted Successive Halving Algorithm (SHA). Its main objective is to minimize energy consumption by identifying inefficient configurations through exploratory pretraining while optimizing hyperparameter configurations. Specifically, SM2 solves the problem in the following ways:

1. **Energy-aware training**: Energy tracking of the training process is achieved through hardware power monitoring.
2. **SHA strategy deployment**: The Successive Halving Algorithm strategy is used to minimize energy waste.
3. **Extended traditional training**: An exploratory component is introduced to efficiently explore hyperparameter configurations.
4. **Experimental verification**: The effectiveness of SM2 is verified through experiments with different models, datasets, and hardware settings, demonstrating that energy efficiency can be improved without sacrificing model performance.

### Formula Explanation

In the paper, the formula for calculating energy consumption is:

\[ E_k = \frac{1}{n}\sum_{i=0}^{n}\text{power}(k, n)\cdot\text{time}(k)/3600,\quad\forall k\in[0, T] \]

where:

- \( E_k \) represents the energy consumption of the \( k \)-th epoch.
- \( \text{power}(k, n) \) represents the average power of the \( k \)-th epoch.
- \( \text{time}(k) \) represents the duration of the \( k \)-th epoch.
- \( n \) represents the total number of epochs.
- \( T \) represents the total time interval.

In addition, SM2 uses an objective function to jointly evaluate model performance, energy consumption per round, and learning rate:

\[ f(\alpha,\beta) = \alpha\times P + (1-\alpha)\times(\beta\times E + (1-\beta)\times LR) \]

where:

- \( P \) represents model performance.
- \( E \) represents energy consumption per round.
- \( LR \) represents the selected learning rate.
- \( \alpha \) and \( \beta \) are weighting parameters used to balance the influence of these three attributes.

In this way, SM2 takes both performance and energy efficiency into account while optimizing hyperparameters.
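The two formulas above, together with one halving step, can be sketched in Python to make the arithmetic concrete. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (`epoch_energy_wh`, `objective`, `successive_halving_step`) are hypothetical. The energy function reads the formula as averaging power samples taken within one epoch, with power in watts and time in seconds so that dividing by 3600 yields watt-hours. The objective assumes that \( P \), \( E \), and \( LR \) have already been scaled to comparable scores where larger is better. The halving step is the generic "keep the top half" rule of successive halving and may differ from how SM2 selects configurations.

```python
from typing import Dict, List, Sequence


def epoch_energy_wh(power_samples_w: Sequence[float], epoch_seconds: float) -> float:
    """Energy of one epoch: mean sampled power times duration, in watt-hours.

    Assumption: power is sampled in watts during the epoch and the epoch
    duration is given in seconds, so dividing by 3600 converts Ws to Wh.
    """
    if len(power_samples_w) == 0:
        raise ValueError("at least one power sample is required")
    mean_power_w = sum(power_samples_w) / len(power_samples_w)
    return mean_power_w * epoch_seconds / 3600.0


def objective(p: float, e: float, lr: float, alpha: float, beta: float) -> float:
    """f(alpha, beta) = alpha*P + (1 - alpha)*(beta*E + (1 - beta)*LR).

    Assumption: p, e, and lr are pre-scaled scores where larger is better.
    """
    if not (0.0 <= alpha <= 1.0 and 0.0 <= beta <= 1.0):
        raise ValueError("alpha and beta must lie in [0, 1]")
    return alpha * p + (1.0 - alpha) * (beta * e + (1.0 - beta) * lr)


def successive_halving_step(scores: Dict[str, float]) -> List[str]:
    """Keep the better-scoring half of the configurations (at least one)."""
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return ranked[: max(1, len(ranked) // 2)]


if __name__ == "__main__":
    # Two power readings of 100 W and 200 W over a one-hour epoch.
    print(epoch_energy_wh([100.0, 200.0], 3600.0))  # 150.0

    # Hypothetical scores for four configurations after one round.
    round_scores = {
        "cfg_a": objective(0.90, 0.40, 0.50, alpha=0.5, beta=0.5),
        "cfg_b": objective(0.60, 0.90, 0.50, alpha=0.5, beta=0.5),
        "cfg_c": objective(0.30, 0.20, 0.50, alpha=0.5, beta=0.5),
        "cfg_d": objective(0.85, 0.80, 0.50, alpha=0.5, beta=0.5),
    }
    print(successive_halving_step(round_scores))
```

Setting \( \alpha = 1 \) reduces the objective to pure performance ranking, while lowering \( \alpha \) and raising \( \beta \) shifts weight toward the energy term.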