Regularization and Iterative Initialization of Softmax for Fast Training of Convolutional Neural Networks.

Qiang Rao, Bing Yu, Kun He, Bailan Feng
DOI: https://doi.org/10.1109/ijcnn.2019.8852459
2019-01-01
Abstract: A softmax regularizer is proposed: a simple and elegant constraint on the distribution of softmax weights during training. Since the direct estimation of feature centers is neither memory efficient nor robust, the proposed regularizer exploits the relations between feature centers and the classifier weights by adding constraints on the distances between softmax weight vectors. This enlarges the distances between softmax weights, which benefits the separation of different classes, and provides extra gradients for the optimization of softmax, which speeds up training. Furthermore, we argue that the massive number of softmax parameters is the main cause of slow network convergence, especially in classification tasks with a large number of classes, such as face recognition. Motivated by the analysis of the relations between deep features and softmax weights, a fast training process is presented, which splits the training into multiple stages and alternates between training and initializing the softmax weights for fast convergence when the number of classes is large. Since the softmax weights can be initialized with estimated deep feature centers, the scale of the training data can be gradually increased across the stages. With this procedure, the total computational cost of training can be reduced. To validate its effectiveness, our approach is applied to both face recognition and image classification tasks. It obtains performance comparable to state-of-the-art methods with a faster training process.
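The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the regularizer's formula, so the penalty below (the mean of exp(-squared distance) over all pairs of weight vectors) is only an assumed stand-in that shrinks as weight vectors move apart. The function names `class_centers` and `weight_separation_penalty` are hypothetical, and plain lists stand in for deep features and softmax weights.

```python
import math


def class_centers(features, labels, num_classes):
    """Return the mean feature vector of each class.

    Classes with no samples get an all-zero center.
    """
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for vec, label in zip(features, labels):
        counts[label] += 1
        for k, value in enumerate(vec):
            sums[label][k] += value
    return [
        [s / counts[c] if counts[c] else 0.0 for s in sums[c]]
        for c in range(num_classes)
    ]


def weight_separation_penalty(weights):
    """Assumed penalty (not taken from the paper): the mean of
    exp(-squared Euclidean distance) over all pairs of weight vectors.
    Minimizing it pushes the weight vectors apart.
    """
    total, pairs = 0.0, 0
    for i in range(len(weights)):
        for j in range(i + 1, len(weights)):
            sq_dist = sum((a - b) ** 2 for a, b in zip(weights[i], weights[j]))
            total += math.exp(-sq_dist)
            pairs += 1
    return total / pairs if pairs else 0.0


# Toy data: three 2-D "deep features" belonging to two classes.
features = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
labels = [0, 0, 1]

# Initialize the softmax weights with the estimated feature centers,
# then measure how well separated those weights are.
softmax_weights = class_centers(features, labels, num_classes=2)
penalty = weight_separation_penalty(softmax_weights)
```

In a staged schedule like the one the abstract describes, `class_centers` would be recomputed from the current network's features at the start of each stage, and a penalty of this kind would be added to the classification loss with a small weight; both details are assumptions here.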