High-speed hyperparameter optimization for deep ResNet models in image recognition

Abbas Jafar, Myungho Lee
DOI: https://doi.org/10.1007/s10586-021-03284-6
2021-05-17
Cluster Computing
Abstract: The Convolutional Neural Network (CNN) is one of the most widely used deep learning models in pattern and image recognition. It can be trained on a large number of datasets and produce valuable results. The deep Residual Network (ResNet) is one of the most innovative CNN architectures: it can train networks with thousands of layers or more and achieves high performance on complex problems. This deep model trains neural networks using identity shortcut connections that skip layers. In this paper, we build a hyperparameter optimization approach for ResNet models with different numbers of layers and show that our optimization leads to significant performance improvements. We developed a manual search approach by enhancing the traditional data augmentation proposed in previous approaches. Using the CIFAR-100 dataset for classification, our approach significantly improves the classification error rate. It also reduces the computing time compared with previous automatic approaches. Our approach achieves a 23.01% error rate with 4 h 15 min of computing time for ResNet-164 on the CIFAR-100 dataset using an NVIDIA Tesla P100 GPU.
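As a rough illustration of the identity shortcut connection mentioned in the abstract, the following minimal sketch computes a block output as the learned residual plus the unchanged input, y = F(x) + x. It is not the authors' implementation: the names `residual_block` and `residual_fn` are hypothetical, plain Python lists stand in for feature maps, and convolution, normalization, and activation layers are omitted.

```python
from typing import Callable, List

Vector = List[float]


def residual_block(x: Vector, residual_fn: Callable[[Vector], Vector]) -> Vector:
    """Return residual_fn(x) + x, i.e. the residual branch plus the identity shortcut.

    residual_fn is a stand-in (assumption) for the stacked layers that the
    shortcut skips; it must return a vector with the same length as x.
    """
    fx = residual_fn(x)
    if len(fx) != len(x):
        raise ValueError("residual branch must preserve the input shape")
    # Element-wise sum: the input bypasses the skipped layers unchanged.
    return [f + xi for f, xi in zip(fx, x)]


if __name__ == "__main__":
    # If the residual branch outputs zeros, the block reduces to the identity.
    print(residual_block([1.0, -2.0], lambda v: [0.0 for _ in v]))
    # A toy residual branch that doubles its input.
    print(residual_block([1.0, 2.0], lambda v: [2.0 * a for a in v]))
```

Because the shortcut adds the input back unchanged, a block whose residual branch outputs zeros behaves as the identity, which is one common explanation for why very deep stacks of such blocks remain trainable.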
What problem does this paper attempt to address?