An Improved BGE-Adam Optimization Algorithm Based on Entropy Weighting and Adaptive Gradient Strategy

Yichuan Shao, Jiantao Wang, Haijing Sun, Hao Yu, Lei Xing, Qian Zhao, Le Zhang
DOI: https://doi.org/10.3390/sym16050623
2024-05-18
Symmetry
Abstract: This paper introduces an enhanced variant of the Adam optimizer, the BGE-Adam optimization algorithm, which integrates three techniques to improve the adaptability, convergence, and robustness of the original algorithm under various training conditions. First, BGE-Adam incorporates a dynamic β parameter adjustment mechanism that uses the rate of gradient change to adjust the exponential decay rates of the first and second moment estimates (β1 and β2). The adjustment of β1 and β2 is symmetrical: the algorithm applies the same rules when adjusting both. This design helps maintain the consistency and balance of the algorithm and lets it adaptively track gradient trends. Second, it estimates the direction of future gradients with a simple gradient prediction model that combines historical gradient information with the current gradient. Lastly, entropy weighting is integrated into the gradient update step. This strategy enhances the model's exploratory behavior by introducing a certain amount of noise, thereby improving its adaptability to complex loss surfaces. Experimental results on the classical datasets MNIST and CIFAR10 and on a gastrointestinal disease medical image dataset show that BGE-Adam has improved convergence and generalization capabilities. On the gastrointestinal disease test dataset, BGE-Adam achieved an accuracy of 69.36%, compared with 67.66% for standard Adam (a gain of 1.70 percentage points); on the CIFAR10 test dataset, it reached 71.4%, compared with 70.65% for Adam; and on the MNIST dataset, it reached 99.34%, compared with 99.23% for Adam.
The BGE-Adam optimization algorithm exhibits better convergence and robustness. This research not only demonstrates the effectiveness of the combination of these three technologies but also provides new perspectives for the future development of deep learning optimization algorithms.
multidisciplinary sciences
What problem does this paper attempt to address?
The paper focuses on improving the Adam optimization algorithm to enhance its performance when training deep learning models. The researchers propose an enhanced version of Adam, BGE-Adam, which improves the adaptability, convergence, and robustness of the original algorithm through three techniques:

1. **Dynamic β Parameter Adjustment Mechanism**: This mechanism adjusts the exponential decay rates of the first and second moment estimates (β1 and β2) based on the rate of change of the gradient, allowing the algorithm to track gradient trends more flexibly.
2. **Gradient Prediction Model**: A simple model combines historical gradient information with the current gradient to predict the direction of future gradients. This lets the parameter update strategy be adjusted in advance, reducing the chance of over-updating and making training more stable.
3. **Entropy Weighting**: Entropy weighting is introduced in the gradient update step. Adding a certain amount of random noise to the parameter updates strengthens the model's ability to explore complex loss surfaces.

Experimental results show that BGE-Adam achieves better convergence and generalization than standard Adam on the classic datasets MNIST and CIFAR10 as well as on a medical image dataset. On the gastrointestinal disease medical image test dataset, BGE-Adam reached an accuracy of 69.36%, compared with 67.66% for standard Adam (+1.70 percentage points); on the CIFAR10 test dataset, 71.4% compared with 70.65% (+0.75 percentage points); and on the MNIST dataset, 99.34% compared with 99.23% (+0.11 percentage points).
In summary, this paper aims to improve the Adam optimization algorithm through the aforementioned three technical innovations, enhancing its performance under different training conditions. These improvements not only validate the effectiveness of the proposed techniques but also provide new perspectives for the future development of deep learning optimization algorithms.
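To make the three mechanisms concrete, the following is a minimal NumPy sketch of how they might fit together in a single update step. It is not the paper's implementation: the text above gives no formulas, so the class name `BGEAdamSketch`, the β ranges, the squashing of the gradient change rate, the blending weight `alpha`, and the Gaussian noise scaled by `entropy_weight` are all assumptions chosen for illustration.

```python
import numpy as np


class BGEAdamSketch:
    """Illustrative Adam variant. Every formula below is an assumption,
    not the rule used in the paper."""

    def __init__(self, lr=1e-3, beta1_range=(0.5, 0.9), beta2_range=(0.9, 0.999),
                 alpha=0.5, entropy_weight=0.01, eps=1e-8, seed=0):
        self.lr = lr
        self.beta1_range = beta1_range        # assumed (min, max) for beta1
        self.beta2_range = beta2_range        # assumed (min, max) for beta2
        self.alpha = alpha                    # weight on the current gradient
        self.entropy_weight = entropy_weight  # scale of the exploration noise
        self.eps = eps
        self.rng = np.random.default_rng(seed)
        self.m = None
        self.v = None
        self.prev_grad = None
        self.beta1_prod = 1.0  # running products of the betas, for bias correction
        self.beta2_prod = 1.0
        self.beta1 = beta1_range[1]
        self.beta2 = beta2_range[1]

    def _dynamic_betas(self, grad):
        # Relative gradient change, squashed into [0, 1).
        if self.prev_grad is None:
            change = 0.0
        else:
            rate = np.linalg.norm(grad - self.prev_grad) / (
                np.linalg.norm(self.prev_grad) + self.eps)
            change = rate / (1.0 + rate)
        lo1, hi1 = self.beta1_range
        lo2, hi2 = self.beta2_range
        # The same rule is applied to both betas (the "symmetrical" adjustment):
        # faster-changing gradients give lower betas, i.e. shorter memory.
        return hi1 - (hi1 - lo1) * change, hi2 - (hi2 - lo2) * change

    def step(self, params, grad):
        params = np.asarray(params, dtype=float)
        grad = np.asarray(grad, dtype=float)
        if self.m is None:
            self.m = np.zeros_like(params)
            self.v = np.zeros_like(params)

        # 1. Dynamic beta adjustment from the gradient change rate.
        self.beta1, self.beta2 = self._dynamic_betas(grad)

        # 2. Gradient prediction: blend of current and previous gradient (assumed form).
        if self.prev_grad is None:
            predicted = grad
        else:
            predicted = self.alpha * grad + (1.0 - self.alpha) * self.prev_grad
        self.prev_grad = grad.copy()

        # Adam moment estimates; bias correction uses products of the varying betas.
        self.m = self.beta1 * self.m + (1.0 - self.beta1) * predicted
        self.v = self.beta2 * self.v + (1.0 - self.beta2) * grad ** 2
        self.beta1_prod *= self.beta1
        self.beta2_prod *= self.beta2
        m_hat = self.m / (1.0 - self.beta1_prod)
        v_hat = self.v / (1.0 - self.beta2_prod)

        # 3. Entropy weighting, modeled here as small Gaussian noise on the update.
        update = self.lr * m_hat / (np.sqrt(v_hat) + self.eps)
        noise = self.entropy_weight * self.lr * self.rng.standard_normal(params.shape)
        return params - update - noise


# Usage: minimize f(x) = 0.5 * ||x||^2, whose gradient is x itself.
opt = BGEAdamSketch(lr=0.01)
x = np.array([3.0, -2.0])
for _ in range(2000):
    x = opt.step(x, grad=x)
print("final loss:", 0.5 * float(x @ x))
```

Because β1 and β2 change from step to step in this sketch, the bias correction divides by one minus the running product of the betas rather than by the usual `1 - beta ** t`; the two coincide when the betas are constant.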