Multi-Objective Hyperparameter Optimization in Machine Learning -- An Overview

Florian Karl, Tobias Pielok, Julia Moosbauer, Florian Pfisterer, Stefan Coors, Martin Binder, Lennart Schneider, Janek Thomas, Jakob Richter, Michel Lang, Eduardo C. Garrido-Merchán, Juergen Branke, Bernd Bischl
DOI: https://doi.org/10.1145/3610536
2024-06-06
Abstract: Hyperparameter optimization constitutes a large part of typical modern machine learning workflows. This arises from the fact that machine learning methods and corresponding preprocessing steps often only yield optimal performance when hyperparameters are properly tuned. But in many applications, we are not only interested in optimizing ML pipelines solely for predictive accuracy; additional metrics or constraints must be considered when determining an optimal configuration, resulting in a multi-objective optimization problem. This is often neglected in practice, due to a lack of knowledge and readily available software implementations for multi-objective hyperparameter optimization. In this work, we introduce the reader to the basics of multi-objective hyperparameter optimization and motivate its usefulness in applied ML. Furthermore, we provide an extensive survey of existing optimization strategies, both from the domain of evolutionary algorithms and Bayesian optimization. We illustrate the utility of MOO in several specific ML applications, considering objectives such as operating conditions, prediction time, sparseness, fairness, interpretability and robustness.
Machine Learning
What problem does this paper attempt to address?
This paper addresses multi-objective hyperparameter optimization in machine learning. Hyperparameter optimization is central to typical machine learning workflows, because learning methods and their preprocessing steps often reach their best performance only when hyperparameters are properly tuned. In many applications, however, predictive accuracy is not the only concern: additional objectives or constraints such as operating conditions, prediction time, sparsity, fairness, interpretability, and robustness must also be considered, which turns tuning into a multi-objective optimization problem. According to the authors, this is often neglected in practice because knowledge of the topic and readily available software implementations are lacking.

The paper introduces the basics of multi-objective hyperparameter optimization and motivates its usefulness in applied machine learning. It surveys existing optimization strategies from both evolutionary algorithms and Bayesian optimization, and illustrates them with specific machine learning applications that involve the objectives listed above. It also discusses why folding several objectives into a single metric is problematic: the trade-offs are often hard to fix in advance without knowing which solutions are attainable. It is therefore worthwhile to treat the problem as genuinely multi-objective and search for the set of Pareto optimal configurations, so that experts can inspect these solutions afterwards and make an informed, application-specific choice.

In summary, the paper aims to give machine learning practitioners a comprehensive introduction to and review of multi-objective hyperparameter optimization, and it may also be of interest to researchers and practitioners who are already familiar with multi-objective optimization.
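To make the notion of a Pareto optimal set concrete, the following is a minimal sketch, not code from the paper. It assumes that every objective is to be minimized and that each hyperparameter configuration has already been evaluated. The function names `dominates` and `pareto_front` and the candidate values (validation error, prediction time in milliseconds) are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` Pareto-dominates `b` under minimization.

    `a` dominates `b` when it is no worse in every objective and
    strictly better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[int]:
    """Return the indices of all points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical evaluated configurations:
# (validation error, prediction time in ms), both to be minimized.
candidates = [
    (0.10, 50.0),
    (0.12, 20.0),
    (0.15, 25.0),  # dominated by (0.12, 20.0)
    (0.08, 90.0),
    (0.12, 30.0),  # dominated by (0.12, 20.0)
]

front = pareto_front(candidates)
for i in front:
    print(i, candidates[i])
```

With these made-up values, three configurations remain on the front. None of them is better than the others in both objectives at once, which is why the final choice is left to a decision maker who knows the application. The pairwise comparison is quadratic in the number of configurations; that is acceptable for a sketch but is not meant to represent how the surveyed optimizers work internally.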