Improving Steepest Descent Method by Learning Rate Annealing and Momentum in Neural Network
Udai Bhan Trivedi, Priti Mishra
DOI: https://doi.org/10.1007/978-981-15-7804-5_14
2020-11-26
Abstract: The backpropagation neural network (BPNN) method, an important algorithm in machine learning, has been applied to a wide range of real-world problems such as pattern recognition, optimization, approximation, classification, and data clustering. The BPNN algorithm has been widely used in age estimation, pedestrian gender classification, traffic sign recognition, character recognition, water pollution forecasting models, heart disease classification, breast cancer detection (Keskar et al. in On large-batch training for deep learning: Generalization gap and sharp minima, 2016) [1], remote sensing, and image classification (He et al. in Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 558–567, 2019) [2]. The algorithm uses the steepest descent (gradient) method and suffers from limitations such as convergence to local minima and slow learning convergence. This research proposes a solution to slow convergence through learning rate annealing, in which the learning rate declines as training progresses rather than staying constant throughout training. The problem of local minima can be addressed with momentum, which adds a fraction of the past weight update to the calculation of the current weight update.
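The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the inverse-time decay schedule, the function names (`annealed_lr`, `descend`, `toy_grad`), the hyperparameter values, and the toy quadratic objective are all assumptions chosen for illustration, since the abstract does not specify them.

```python
def annealed_lr(lr0, decay, step):
    """Learning rate that declines as training progresses.

    Inverse-time decay is an assumed schedule; the abstract does not
    say which annealing schedule the paper uses.
    """
    return lr0 / (1.0 + decay * step)


def descend(grad, w0, lr0=0.1, decay=0.01, momentum=0.9, steps=300):
    """Steepest descent with an annealed learning rate and a momentum term."""
    w = list(w0)
    update = [0.0] * len(w)  # previous weight update, initially zero
    for step in range(steps):
        lr = annealed_lr(lr0, decay, step)
        g = grad(w)
        # Current update = fraction of the past update - lr * gradient.
        update = [momentum * u - lr * gi for u, gi in zip(update, g)]
        w = [wi + ui for wi, ui in zip(w, update)]
    return w


def toy_grad(w):
    """Gradient of the toy objective f(w) = (w[0] - 3)^2 + (w[1] + 1)^2."""
    return [2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)]


# The toy objective has its minimum at (3, -1).
w_final = descend(toy_grad, [0.0, 0.0])
print(w_final)
```

In a neural network, `grad` would be the backpropagated gradient of the loss with respect to the weights. Setting `momentum=0.0` and `decay=0.0` reduces the sketch to plain steepest descent with a constant learning rate.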