Abstract: Machine learning (ML) methods are used across many technical areas, including image recognition, product recommendation, financial analysis, medical diagnosis, and predictive maintenance. An important aspect of implementing an ML method is controlling its learning process so as to maximize the method's performance. Hyperparameter tuning is the process of selecting a suitable set of the parameters that control this learning process. In this work, we demonstrate the use of discrete simulation optimization methods, such as ranking and selection (R&S) and random search, for identifying a hyperparameter set that maximizes the performance of an ML method. Specifically, we use the KN R&S method, the stochastic ruler random search method, and one variation of the stochastic ruler method for this purpose. We also construct the theoretical basis for applying the KN method, which determines the optimal solution with a statistical guarantee by enumerating the solution space. In comparison, the stochastic ruler method converges asymptotically to global optima and incurs smaller computational overheads. We demonstrate the application of these methods to a wide variety of machine learning models, including deep neural network models used for time series prediction and image classification. We benchmark our application of these methods against state-of-the-art hyperparameter optimization libraries such as $hyperopt$ and $mango$. The KN method consistently outperforms $hyperopt$'s random search (RS) and Tree of Parzen Estimators (TPE) methods. The stochastic ruler method outperforms the $hyperopt$ RS method and offers performance statistically comparable to that of $hyperopt$'s TPE method and the $mango$ algorithm.
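The abstract does not spell out the stochastic ruler procedure, so the following is only a minimal sketch of the method as it is commonly described, written for minimizing a noisy validation loss over a finite hyperparameter grid. Everything named here is an assumption made for illustration: the function names `stochastic_ruler` and `toy_validation_loss`, the grid of (learning rate, depth) pairs, the loss bounds `a` and `b`, the logarithmic test schedule `m_k`, and the choice to treat every other grid point as a neighbor. The synthetic loss stands in for an actual train-and-validate run and is not taken from the paper.

```python
import itertools
import math
import random


def stochastic_ruler(noisy_loss, space, x0, a, b, iterations=2000, seed=0):
    """Sketch of stochastic ruler search for minimizing E[noisy_loss(x)].

    space : finite list of candidate hyperparameter settings
    a, b  : assumed lower and upper bounds on observed loss values
    """
    rng = random.Random(seed)
    x = x0
    for k in range(iterations):
        # Assumed neighborhood: every other point in the grid.
        z = rng.choice([p for p in space if p != x])
        # Assumed schedule: the number of tests grows slowly with k.
        m_k = 1 + int(math.log(k + 2))
        # Accept z only if every fresh noisy observation of its loss falls at
        # or below a fresh uniform "ruler" draw from [a, b]. all() stops at
        # the first failed test, which rejects the candidate.
        if all(noisy_loss(z, rng) <= rng.uniform(a, b) for _ in range(m_k)):
            x = z
    return x


# Hypothetical grid of (learning_rate, depth) pairs.
GRID = list(itertools.product([0.001, 0.01, 0.1], [2, 4, 8]))


def toy_validation_loss(params, rng):
    """Synthetic noisy loss with its lowest mean at lr=0.01, depth=4."""
    lr, depth = params
    mean = 0.2 + 0.3 * abs(math.log10(lr) + 2) + 0.05 * abs(depth - 4)
    # Clip to [0, 1] so the observations respect the ruler bounds.
    return min(1.0, max(0.0, mean + rng.gauss(0, 0.05)))


best = stochastic_ruler(toy_validation_loss, GRID, GRID[0], a=0.0, b=1.0)
print(best)
```

Because a candidate must pass a growing number of ruler tests, settings with a lower expected loss are accepted more often as the search proceeds. In practice `noisy_loss` would train the model with the given setting and return a validation metric, and the bounds `a` and `b` would be chosen to cover the range of that metric.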

Stochastic Hyperparameter Optimization through Hypernetworks

A Method of Adaptive Hyperparameter Optimization for Deep Generative Models

Scalable Nested Optimization for Deep Learning

Training Deep Neural Networks by optimizing over nonlocal paths in hyperparameter space

Self-Tuning Networks: Bilevel Optimization of Hyperparameters using Structured Best-Response Functions

Improving Hyperparameter Optimization with Checkpointed Model Weights

Cross-Entropy Optimization for Hyperparameter Optimization in Stochastic Gradient-based Approaches to Train Deep Neural Networks

A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning

Multi-level Training and Bayesian Optimization for Economical Hyperparameter Optimization

Towards Hyperparameter-Agnostic DNN Training via Dynamical System Insights

Hyperparameter Optimization for Machine Learning Models Based on Bayesian Optimization

An effective algorithm for hyperparameter optimization of neural networks

Fast Hyperparameter Optimization of Deep Neural Networks via Ensembling Multiple Surrogates

Practical Multi-fidelity Bayesian Optimization for Hyperparameter Tuning

A Unified Gaussian Process for Branching and Nested Hyperparameter Optimization

Discrete Simulation Optimization for Tuning Machine Learning Method Hyperparameters

On Hyper-parameter Tuning for Stochastic Optimization Algorithms

Agent-based Collaborative Random Search for Hyper-parameter Tuning and Global Function Optimization

A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks

Bayesian Optimization for Hyperparameters Tuning in Neural Networks

Towards optimal hierarchical training of neural networks