Abstract: A fundamental step in the development of machine learning models commonly involves the tuning of hyperparameters, often leading to multiple model training runs to work out the best-performing configuration. As machine learning tasks and models grow in complexity, there is an escalating need for solutions that not only improve performance but also address sustainability concerns. Existing strategies predominantly focus on maximizing the performance of the model without considering energy efficiency. To bridge this gap, in this paper, we introduce Spend More to Save More (SM2), an energy-aware hyperparameter optimization implementation based on the widely adopted successive halving algorithm. Unlike conventional approaches including energy-intensive testing of individual hyperparameter configurations, SM2 employs exploratory pretraining to identify inefficient configurations with minimal energy expenditure. Incorporating hardware characteristics and real-time energy consumption tracking, SM2 identifies an optimal configuration that not only maximizes the performance of the model but also enables energy-efficient training. Experimental validations across various datasets, models, and hardware setups confirm the efficacy of SM2 to prevent the waste of energy during the training of hyperparameter configurations.

What problem does this paper attempt to address?

This paper addresses how to improve model performance and reduce energy consumption at the same time during hyperparameter optimization of machine learning models. As machine learning tasks and models grow more complex, existing hyperparameter optimization methods often focus only on maximizing model performance and ignore energy efficiency. This leads to a great deal of unnecessary energy waste, especially in large-scale training.

To address this challenge, the paper proposes a method named "Spend More to Save More (SM2)". SM2 is an energy-aware hyperparameter optimization implementation based on the widely adopted Successive Halving Algorithm (SHA). Its main objective is to minimize energy consumption while optimizing hyperparameter configurations, by identifying inefficient configurations through exploratory pretraining. Specifically, SM2 approaches the problem in the following ways:

1. **Energy-aware training**: Energy tracking of the training process is achieved through hardware power monitoring.
2. **SHA strategy deployment**: The Successive Halving Algorithm strategy is used to minimize energy waste.
3. **Extended traditional training**: An exploratory component is introduced to efficiently explore hyperparameter configurations.
4. **Experimental verification**: The effectiveness of SM2 is verified through experiments with different models, datasets, and hardware settings, demonstrating that energy efficiency can be improved without sacrificing model performance.

### Formula Explanation

In the paper, the formula for calculating energy consumption is:

\[ E_k=\frac{1}{n}\sum_{i = 0}^{n}\text{power}(k, n)\cdot\text{time}(k)/3600,\quad\forall k\in[0, T] \]

where:

- \( E_k \) represents the energy consumption of the \( k \)-th epoch.
- \( \text{power}(k, n) \) represents the average power of the \( k \)-th epoch.
- \( \text{time}(k) \) represents the duration of the \( k \)-th epoch.
- \( n \) represents the total number of epochs.
- \( T \) represents the total time interval.

In addition, SM2 uses an objective function to jointly evaluate model performance, energy consumption per round, and learning rate:

\[ f(\alpha,\beta)=\alpha\times P+(1 - \alpha)\times(\beta\times E+(1 - \beta)\times LR) \]

where:

- \( P \) represents model performance.
- \( E \) represents energy consumption per round.
- \( LR \) represents the selected learning rate.
- \( \alpha \) and \( \beta \) are weighting parameters used to balance the influence of these three attributes.

In this way, SM2 takes both performance and energy efficiency into account while optimizing hyperparameters.
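The two formulas and the halving strategy can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions not stated in the summary: the sum in the energy formula is read as an average over power readings sampled during one epoch, power is in watts and duration in seconds (so dividing by 3600 gives watt-hours), the inputs \( P \), \( E \), and \( LR \) are already on comparable scales, and a higher score is treated as better during halving. The function names `epoch_energy_wh`, `weighted_objective`, and `successive_halving` are hypothetical, and the halving loop is the plain textbook algorithm without SM2's exploratory pretraining.

```python
from statistics import fmean


def epoch_energy_wh(power_samples_w, duration_s):
    """Energy of one epoch: mean sampled power (W) * duration (s) / 3600.

    Assumption: the readings were taken during this epoch and the result
    is in watt-hours.
    """
    if not power_samples_w:
        raise ValueError("need at least one power reading")
    return fmean(power_samples_w) * duration_s / 3600.0


def weighted_objective(p, e, lr, alpha, beta):
    """f(alpha, beta) = alpha*P + (1 - alpha)*(beta*E + (1 - beta)*LR).

    Assumption: p, e and lr are pre-normalized; the summary does not say
    how they are scaled or whether f is maximized or minimized.
    """
    if not (0.0 <= alpha <= 1.0 and 0.0 <= beta <= 1.0):
        raise ValueError("alpha and beta must lie in [0, 1]")
    return alpha * p + (1.0 - alpha) * (beta * e + (1.0 - beta) * lr)


def successive_halving(configs, evaluate, eta=2, min_budget=1):
    """Plain successive halving (not SM2 itself).

    Each round scores every surviving configuration at the current budget
    via evaluate(config, budget), keeps the best 1/eta of them (higher
    score is assumed better), and multiplies the budget by eta.
    """
    survivors = list(configs)
    if not survivors:
        raise ValueError("need at least one configuration")
    if eta < 2:
        raise ValueError("eta must be at least 2")
    budget = min_budget
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget), reverse=True)
        survivors = ranked[: max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0]


if __name__ == "__main__":
    # Three power readings averaging 110 W over a 30-minute epoch.
    print(epoch_energy_wh([100.0, 120.0, 110.0], 1800.0))  # 55.0
    print(weighted_objective(0.9, 0.2, 0.1, alpha=0.5, beta=0.5))
```

In an energy-aware setting, the `evaluate` callback could train a configuration for `budget` epochs, measure energy with `epoch_energy_wh`, and return a score built from `weighted_objective`; how SM2 actually combines these steps is not described in the summary.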

Spend More to Save More (SM2): An Energy-Aware Implementation of Successive Halving for Sustainable Hyperparameter Optimization

Towards Low-Budget Energy Efficiency Design in Additive Manufacturing Based on Variational Scale-Aware Transformer

The Power of Training: How Different Neural Network Setups Influence the Energy Demand

Two-step hyperparameter optimization method: Accelerating hyperparameter search by using a fraction of a training dataset

Impact of ML Optimization Tactics on Greener Pre-Trained ML Models

Computing Within Limits: An Empirical Study of Energy Consumption in ML Training and Inference

Hyper-Tune: Towards Efficient Hyper-parameter Tuning at Scale

Uncovering Energy-Efficient Practices in Deep Learning Training: Preliminary Steps Towards Green AI

Watt For What: Rethinking Deep Learning's Energy-Performance Relationship

On the Benefits of Using Metaheuristics in the Hyperparameter Tuning of Deep Learning Models for Energy Load Forecasting

FastTuning: Enabling Fast and Efficient Hyper-Parameter Tuning with Partitioning and Parallelism of Search Space

A hybrid-model approach for reducing the performance gap in building energy forecasting

Interactive effects of hyperparameter optimization techniques and data characteristics on the performance of machine learning algorithms for building energy metamodeling

AI-driven predictive models for sustainability

Multi-level Training and Bayesian Optimization for Economical Hyperparameter Optimization

Towards energy-efficient Deep Learning: An overview of energy-efficient approaches along the Deep Learning Lifecycle

Discrete Simulation Optimization for Tuning Machine Learning Method Hyperparameters

Carbon Emissions and Large Neural Network Training

Towards Leveraging AutoML for Sustainable Deep Learning: A Multi-Objective HPO Approach on Deep Shift Neural Networks

Performance and Energy Consumption of Parallel Machine Learning Algorithms

Enhancing Support Vector Machine Performance: A Hybrid Approach with Davidon-Fletcher-Powell Algorithm and Elephant Herding Optimization (EHO-DFP) for Parameter Optimization