A modification of adaptive moment estimation (Adam) for machine learning

Jiaxin Yang, Qiang Long
DOI: https://doi.org/10.3934/jimo.2024014
2024-02-03
Journal of Industrial and Management Optimization
Abstract: In deep learning, the accuracy and generalization ability of a model depend largely on how well the loss function is optimized. To date, dozens of optimization methods have been used in deep learning models. Among them, stochastic gradient descent (SGD) is a popular and widely used method, and most other current optimization methods are variants or improvements of the original SGD. Of these variants and improvements, adaptive moment estimation (Adam) is one of the classics. However, Adam has also been reported to exhibit non-convergence or erroneous convergence. Drawing on the improvements made by existing algorithms, this paper proposes an improved algorithm based on Adam, called NewAdam. NewAdam modifies Adam in both its search direction and its learning rate. We analyze it theoretically and conduct numerical experiments on three data sets and two network architectures to illustrate the effectiveness of NewAdam.
engineering, multidisciplinary; operations research & management science; mathematics, interdisciplinary applications
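
For orientation, the baseline Adam update that the abstract refers to can be sketched as follows. This is a minimal sketch of standard Adam only, for a single scalar parameter. The abstract does not specify how NewAdam changes the search direction or the learning rate, so those modifications are not shown. The function names (adam_step, minimize_quadratic), the toy loss f(x) = (x - 3)^2, and the chosen step size are illustrative assumptions, not taken from the paper.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Adam update for a scalar parameter (baseline Adam, not NewAdam)."""
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction compensates for initializing m and v at zero (t starts at 1).
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Search direction is m_hat; the effective step size is lr / (sqrt(v_hat) + eps).
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


def minimize_quadratic(steps=2000, lr=0.05):
    """Minimize the hypothetical toy loss f(x) = (x - 3)^2, starting from x = 0."""
    x, m, v = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * (x - 3.0)  # derivative of (x - 3)^2
        x, m, v = adam_step(x, grad, m, v, t, lr=lr)
    return x


if __name__ == "__main__":
    print(minimize_quadratic())
```

The two places where the abstract says NewAdam departs from Adam correspond, in this sketch, to the direction term m_hat and the scaling term lr / (sqrt(v_hat) + eps).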