Reducing Bias in Deep Learning Optimization: The RSGDM Approach

Honglin Qin, Hongye Zheng, Bingxing Wang, Zhizhong Wu, Bingyao Liu, Yuanfang Yang
2024-09-06
Abstract: Widely used first-order deep learning optimizers fall into two groups: non-adaptive learning rate optimizers, represented by SGDM (Stochastic Gradient Descent with Momentum), and adaptive learning rate optimizers, represented by Adam. Both use exponential moving averages to estimate the overall gradient. However, an estimate of the overall gradient obtained this way is biased and lags. This paper proposes RSGDM, an algorithm based on differential correction. Our contributions are threefold: 1) We analyze the bias and lag introduced by the exponential moving average in the SGDM algorithm. 2) We use a differential estimation term to correct the bias and lag in the SGDM algorithm, yielding the RSGDM algorithm. 3) Experiments on the CIFAR datasets show that RSGDM is superior to SGDM in terms of convergence accuracy.
Machine Learning
What problem does this paper attempt to address?
The paper attempts to address the common issues of bias and lag in deep learning optimization. Specifically, widely used deep learning optimizers such as SGDM (Stochastic Gradient Descent with Momentum) and Adam use exponential moving averages to estimate the overall gradient. However, this method has bias and lag, which affect the convergence speed and accuracy during the optimization process. To solve these problems, the paper proposes a new algorithm based on differential correction: RSGDM (Reduced Bias Stochastic Gradient Descent with Momentum).

### Main Contributions:

1. **Analysis of Bias and Lag**: A detailed analysis of the bias and lag issues caused by the use of exponential moving averages in the SGDM algorithm.
2. **Proposing the RSGDM Algorithm**: By introducing a differential estimation term to correct the bias and lag in the SGDM algorithm, the RSGDM algorithm is proposed.
3. **Experimental Validation**: Experiments were conducted on the CIFAR-10 and CIFAR-100 datasets, demonstrating that the RSGDM algorithm outperforms the traditional SGDM algorithm in terms of convergence accuracy.

### Experimental Results:

- On the CIFAR-10 dataset, the test accuracy of RSGDM is 0.14% higher than that of SGDM.
- On the CIFAR-100 dataset, the test accuracy of RSGDM is 0.57% higher than that of SGDM.

These results indicate that the RSGDM algorithm has achieved significant effects in reducing bias and lag, thereby improving the training effectiveness and generalization ability of deep learning models.
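
To make the bias and lag of an exponential moving average concrete, here is a minimal Python sketch. It assumes the common SGDM-style recursion m_t = beta * m_{t-1} + (1 - beta) * g_t with m_0 = 0 and uses synthetic scalar "gradients". The function `ema_with_difference` and its `gain` factor are hypothetical: they illustrate the general idea of adding a first-difference term, and are not the RSGDM update rule, which this summary does not state.

```python
def ema(grads, beta=0.9):
    """Running EMA m_t = beta * m_{t-1} + (1 - beta) * g_t, starting from m_0 = 0."""
    m, out = 0.0, []
    for g in grads:
        m = beta * m + (1.0 - beta) * g
        out.append(m)
    return out


def ema_with_difference(grads, beta=0.9):
    """EMA plus a first-difference term (hypothetical correction, NOT the paper's rule)."""
    gain = beta / (1.0 - beta)  # assumed weight: the EMA's steady-state lag on a ramp
    m, prev, out = 0.0, None, []
    for g in grads:
        m = beta * m + (1.0 - beta) * g
        diff = 0.0 if prev is None else g - prev  # change since the previous gradient
        out.append(m + gain * diff)
        prev = g
    return out


# Bias from zero initialization: a constant gradient of 1.0 is underestimated early on.
constant = ema([1.0] * 3)

# Lag: on a synthetic ramp g_t = t, the EMA trails the most recent gradient.
ramp = [float(t) for t in range(1, 201)]
plain = ema(ramp)
corrected = ema_with_difference(ramp)

print(constant)                  # early estimates sit well below 1.0
print(ramp[-1] - plain[-1])      # gap approaches beta / (1 - beta)
print(ramp[-1] - corrected[-1])  # gap is close to zero on this ramp
```

On the ramp, the plain average settles about beta / (1 - beta) units behind the latest value, while the difference term cancels that gap. Real stochastic gradients are noisy, so a difference term would also amplify noise; the sketch ignores that trade-off.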