Evaluating Initialization of Nelder-Mead Method for Hyperparameter Optimization in Deep Learning

Shintaro Takenaga, Shuhei Watanabe, Masahiro Nomura, Yoshihiko Ozaki, Masaki Onishi, Hitoshi Habe
DOI: https://doi.org/10.1109/icpr48806.2021.9412240
2021-01-10
Abstract: In deep learning, hyperparameters can severely affect the performance of the learning model. The Nelder-Mead (NM) method is known to show superior performance for hyperparameter optimization in deep learning. An initial simplex, one of the NM method's initial values, is usually determined randomly, yet the search performance strongly depends on the shape of the initial simplex. It is therefore necessary to determine a proper initial simplex, and previous researchers have proposed methods to construct an initial simplex from one starting point in the bounded search space. In this study, we verified how these methods for constructing an initial simplex contribute to improving the result of hyperparameter optimization in deep learning, using a simple model and a complicated model. A smaller initial simplex may cause the optimization to fail by becoming trapped in bad local minima, because both learning models have some bad local minima. We concluded that the starting point need not be located close to the origin, and that a larger initial simplex contributes to improving the result of hyperparameter optimization in deep learning.
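To make the idea of building an initial simplex from one starting point in a bounded search space concrete, here is a minimal sketch. It is not one of the construction methods evaluated in the paper: the function name `axis_simplex`, the `scale` parameter, and the rule of stepping along each axis by a fraction of that axis's range are all assumptions chosen for illustration.

```python
def axis_simplex(start, bounds, scale):
    """Build n+1 simplex vertices from one starting point (illustrative only).

    start  : list of n coordinates inside the box
    bounds : list of n (low, high) pairs describing the search space
    scale  : fraction of each axis range used as the step, in (0, 1]
    """
    vertices = [list(start)]
    for i, (low, high) in enumerate(bounds):
        step = scale * (high - low)
        moved = start[i] + step
        # If the step would leave the box, step the other way and clamp.
        if moved > high:
            moved = max(start[i] - step, low)
        vertex = list(start)
        vertex[i] = moved
        vertices.append(vertex)
    return vertices


# Hypothetical two-hyperparameter search space, normalized to [0, 1].
bounds = [(0.0, 1.0), (0.0, 1.0)]
small = axis_simplex([0.5, 0.5], bounds, scale=0.05)  # tight simplex
large = axis_simplex([0.5, 0.5], bounds, scale=0.5)   # spans half of each axis
```

With a small `scale` the vertices cluster around the starting point, so the search begins by exploring only a narrow neighborhood. With a large `scale` they spread across the box, which matches the abstract's observation that larger initial simplices help avoid bad local minima. A vertex list like this could be handed to an optimizer that accepts a user-supplied simplex, for example through the `initial_simplex` option of SciPy's Nelder-Mead solver.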