The Optimization of Hyperparameter Based on Mathematics for Gradient Descent Algorithm

Abel C. H. Chen
DOI: https://doi.org/10.1109/iccpct61902.2024.10673060
2024-01-01
Abstract: Gradient descent algorithms are widely regarded as the primary choice for optimizing deep learning models. However, they often require tuning several hyperparameters, such as the learning rate. These hyperparameters significantly affect both the speed of convergence and the accuracy of the solution. This study therefore introduces an analytical framework that uses mathematical models to assess the mean error of each objective function under gradient descent algorithms. The framework aims to identify the most effective hyperparameter values by minimizing this mean error. By analyzing the optimization models, generalized principles are established for setting hyperparameter values. Empirical results show that the proposed method achieves higher convergence efficiency and lower errors than existing approaches.
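The idea of choosing a hyperparameter by minimizing a mean error can be illustrated with a small sketch. This is not the paper's framework or its mathematical model: the quadratic objectives f(x) = 0.5 * c * x^2, the curvature values, the step count, the candidate grid, and all function names below are assumptions chosen only to make the idea concrete.

```python
# Illustrative sketch only: pick a learning rate by minimizing the mean
# final error of plain gradient descent over a few toy objectives.
# The objectives f(x) = 0.5 * c * x**2 and every name here are hypothetical.

def final_error(lr, curvature, x0=1.0, steps=20):
    """Run gradient descent on f(x) = 0.5 * curvature * x**2.

    The minimizer is x = 0, so |x| after `steps` updates is the error.
    """
    x = x0
    for _ in range(steps):
        grad = curvature * x      # derivative of 0.5 * c * x**2
        x = x - lr * grad         # standard gradient descent update
    return abs(x)


def mean_error(lr, curvatures):
    """Average the final error over all toy objectives."""
    return sum(final_error(lr, c) for c in curvatures) / len(curvatures)


def best_learning_rate(curvatures, candidates):
    """Return the candidate learning rate with the smallest mean error."""
    return min(candidates, key=lambda lr: mean_error(lr, curvatures))


curvatures = [1.0, 2.0, 4.0]                       # assumed toy objectives
candidates = [i / 100 for i in range(1, 61)]       # grid: 0.01 .. 0.60
best_lr = best_learning_rate(curvatures, candidates)
print(best_lr, mean_error(best_lr, curvatures))
```

For these quadratics each update multiplies x by (1 - lr * c), so any learning rate above 2 / max(c) = 0.5 diverges on the steepest objective. The grid search therefore settles below that bound, which is the kind of general rule an analytical treatment could derive without searching.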