Research Progress on Batch Normalization of Deep Learning and Its Related Algorithms

Abstract: Deep learning has been widely applied in fields such as computer vision and natural language processing, where it has achieved much better results than earlier machine learning methods. With the rapid development of information technology, deep neural networks are trained on ever larger data sets and network depth keeps increasing, which complicates training and leaves speed and accuracy in need of improvement. In 2015, Ioffe et al. pointed out a serious problem in the training of deep neural networks, namely internal covariate shift, which slows training because it requires careful parameter initialization and a small learning rate. They proposed batch normalization (BN) to reduce the effect of internal covariate shift and accelerate the convergence of neural network training. Many networks now use BN as an important means of accelerating training. In view of the practical value of BN, this paper systematically reviews the research progress on BN and its related algorithms. First, the theory of BN is analyzed. Although BN is simple and effective, it has drawbacks, such as its dependence on the mini-batch size and the fact that training and inference are computed differently. Many scholars have therefore proposed algorithms based on BN; the advantages and main functions of these algorithms are analyzed and summarized. Next, the applications of BN in various types of neural networks are summarized, together with other methods for improving the training performance of neural networks. Finally, we summarize the paper and point out future development trends and research directions for BN.
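To make the two drawbacks named in the abstract concrete (dependence on the mini-batch and different computation at training and inference time), the following is a minimal sketch of a BN forward pass in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the function names `batch_norm_train` and `batch_norm_infer` are hypothetical, and the exponential running average with a `momentum` parameter is a common convention assumed here rather than something the abstract specifies.

```python
import math


def batch_norm_train(batch, gamma, beta, running_mean, running_var,
                     momentum=0.1, eps=1e-5):
    """Training mode: normalize each feature with statistics of the current mini-batch.

    `batch` is a list of samples, each a list of feature values.
    Returns the normalized batch plus updated running statistics.
    """
    n = len(batch)
    dims = len(batch[0])
    # Per-feature mean and biased variance, computed over the mini-batch only.
    mean = [sum(row[j] for row in batch) / n for j in range(dims)]
    var = [sum((row[j] - mean[j]) ** 2 for row in batch) / n for j in range(dims)]
    # Normalize, then apply the learnable scale (gamma) and shift (beta).
    out = [[gamma[j] * (row[j] - mean[j]) / math.sqrt(var[j] + eps) + beta[j]
            for j in range(dims)] for row in batch]
    # Assumed convention: exponential moving average of the batch statistics.
    new_mean = [(1 - momentum) * running_mean[j] + momentum * mean[j]
                for j in range(dims)]
    new_var = [(1 - momentum) * running_var[j] + momentum * var[j]
               for j in range(dims)]
    return out, new_mean, new_var


def batch_norm_infer(batch, gamma, beta, running_mean, running_var, eps=1e-5):
    """Inference mode: use the stored running statistics, not the batch's own."""
    return [[gamma[j] * (row[j] - running_mean[j]) / math.sqrt(running_var[j] + eps)
             + beta[j] for j in range(len(row))] for row in batch]
```

In training mode the output for one sample depends on the other samples in the mini-batch, so very small batches give noisy statistics. In inference mode the output depends only on the stored running statistics, which is why the two modes can disagree.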
Computer Science