Pre-training the Deep Generative Models with Adaptive Hyperparameter Optimization

Chengwei Yao,Deng Cai,Jiajun Bu,Gencai Chen
DOI: https://doi.org/10.1016/j.neucom.2017.03.058
IF: 6
2017-01-01
Neurocomputing
Abstract:The performance of many machine learning algorithms depends crucially on the hyperparameter settings, especially in Deep Learning. Manually tuning the hyperparameters is laborious and time consuming. To address this issue, Bayesian optimization (BO) methods and their extensions have been proposed to optimize the hyperparameters automatically. However, they still suffer from highly computational expense when applying to deep generative models (DGMs) due to their strategy of the black-box function optimization. This paper provides a new hyperparameter optimization procedure at the pre-training phase of the DGMs, where we avoid combining all layers as one black-box function by taking advantage of the layer-by-layer learning strategy. Following this procedure, we are able to optimize multiple hyperparameters in an adaptive way by using Gaussian process. In contrast to the traditional BO methods, which mainly focus on the supervised models, the pre-training procedure is unsupervised where there is no validation error can be used. To alleviate this problem, this paper proposes a new holdout loss, the free energy gap, which takes into account both factors of the model fitting and over-fitting. The empirical evaluations demonstrate that our method not only speeds up the process of hyperparameter optimization, but also improves the performances of DGMs significantly in both the supervised and unsupervised learning tasks.
What problem does this paper attempt to address?