Abstract: This thesis presents a novel approach to neural network training that addresses the challenge of determining the optimal number of learning factors. The proposed Adaptive Multiple Optimal Learning Factors (AMOLF) algorithm dynamically adjusts the number of learning factors based on the error change per multiply, leading to improved training efficiency and accuracy. The thesis also introduces techniques for grouping weights based on the curvature of the objective function and for compressing large Hessian matrices. Experimental results demonstrate the superior performance of AMOLF compared to existing methods like OWO-MOLF and Levenberg-Marquardt.

What problem does this paper attempt to address?

This paper addresses how to choose and optimize the number of learning factors when training a multilayer perceptron (MLP). Specifically, the author proposes the Adaptive Multiple Optimal Learning Factors (AMOLF) algorithm, which aims to improve the training efficiency and performance of neural networks by dynamically adjusting the number of learning factors.

### Core Problems of the Paper

1. **Uncertainty of the Number of Learning Factors**: In traditional neural network training, it is hard to determine how many learning factors are needed. Too many or too few learning factors may lead to poor training results.
2. **Limitations of Existing Algorithms**: Existing training algorithms such as OWO-MOLF and Levenberg-Marquardt do not perform ideally on some datasets, especially on large-scale, ill-conditioned problems.

### Solutions

To solve these problems, the author introduces the Adaptive Multiple Optimal Learning Factors algorithm. Its main features are:

- **Adaptive Adjustment of the Number of Learning Factors**: The number of learning factors is adjusted dynamically according to the error change obtained per multiplication operation.
- **Group-based Calculation of Learning Factors Based on the Curvature of the Objective Function**: The weights are grouped according to the curvature of the objective function, and an optimal learning factor is calculated for each group.
- **Linear Compression of the Hessian Matrix**: The large, ill-conditioned Newton Hessian matrix is linearly compressed into a smaller, well-conditioned matrix, which reduces the computational complexity.

### Performance Improvement

The paper verifies through experiments that the AMOLF algorithm outperforms the OWO-MOLF and Levenberg-Marquardt algorithms on multiple datasets, especially in how quickly the error decreases.
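The grouping and compression ideas listed under Solutions can be illustrated with a small NumPy sketch. This is not the thesis's implementation: the function names (`group_by_curvature`, `grouped_step`), the use of the Hessian diagonal as the curvature measure, the equal-size split into groups, and the synthetic quadratic objective are all assumptions made for illustration. Each group gets one learning factor that scales the negative gradient restricted to that group, so the full `n x n` Hessian is compressed to a `k x k` system.

```python
import numpy as np

def group_by_curvature(H, k):
    """Split weight indices into k groups of similar curvature.
    Assumption: curvature is approximated by the diagonal of H."""
    order = np.argsort(np.diag(H))
    return np.array_split(order, k)

def grouped_step(H, g, groups):
    """Compute a weight update using one learning factor per group."""
    D = np.zeros((g.size, len(groups)))
    for j, idx in enumerate(groups):
        D[idx, j] = -g[idx]      # negative gradient restricted to group j
    H_z = D.T @ H @ D            # k x k compressed Hessian
    g_z = D.T @ g                # gradient with respect to the k factors
    # Least squares tolerates a singular compressed Hessian.
    z, *_ = np.linalg.lstsq(H_z, -g_z, rcond=None)
    return D @ z                 # update expressed in weight space

def quad_error(A, b, w):
    """Synthetic quadratic objective with Hessian A and gradient A w - b."""
    return 0.5 * w @ A @ w - b @ w

rng = np.random.default_rng(0)
n = 12
M = rng.standard_normal((n, n))
A = M @ M.T + 0.1 * np.eye(n)    # symmetric positive definite Hessian
b = rng.standard_normal(n)
w0 = np.zeros(n)
g0 = A @ w0 - b                  # gradient at the starting point

# Error after one step for 1, 3, and n learning factors.
errors = {}
for k in (1, 3, n):
    step = grouped_step(A, g0, group_by_curvature(A, k))
    errors[k] = quad_error(A, b, w0 + step)
```

In this toy setting, one group reduces to steepest descent with an exact line search, and one group per weight recovers the Newton step whenever every gradient entry is nonzero. Intermediate group counts trade accuracy per step against the cost of forming and solving the compressed system, which is the trade-off an adaptive rule for the number of factors would have to weigh.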
### Formula Representation

The key formulas behind the algorithm are:

- **Error Function**:
  \[ E=\frac{1}{N_v}\sum_{p = 1}^{N_v}\sum_{i = 1}^{M}(y_p(i)-t_p(i))^2 \]
  where \(y_p(i)\) is the actual output, \(t_p(i)\) is the expected output, \(N_v\) is the number of training samples, and \(M\) is the number of output units.
- **Optimal Learning Factor (OLF)**:
  \[ z =-\frac{\left.\frac{\partial E}{\partial z}\right|_{z = 0}}{\left.\frac{\partial^2 E}{\partial z^2}\right|_{z = 0}} \]
- **Hessian Matrix Element**:
  \[ H_{k,j}=\sum_{m = 1}^{M}\sum_{n = 1}^{N}\sum_{p = 1}^{N_v}\frac{\partial y_p(m)}{\partial w(k,n)}\cdot\frac{\partial y_p(m)}{\partial w(j,n)} \]

With these improvements, the AMOLF algorithm performs better across different datasets and addresses the shortcomings of traditional algorithms on large-scale, complex problems.
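As a worked instance of the OLF formula, the sketch below computes a single optimal learning factor for a linear model, where the error is exactly quadratic in \(z\) and one Newton step on \(z\) lands on the minimum along the search direction. The linear model, the synthetic data, and the names `mse` and `optimal_learning_factor` are assumptions chosen to keep the example short; an MLP would use the same ratio of first to second derivative, evaluated at \(z = 0\).

```python
import numpy as np

def mse(W, X, T):
    """Error function E: squared error summed over outputs, averaged over patterns."""
    Y = X @ W.T
    return np.sum((Y - T) ** 2) / X.shape[0]

def optimal_learning_factor(W, G, X, T):
    """Return z = -(dE/dz) / (d2E/dz2) at z = 0 for weights W + z * G."""
    n_v = X.shape[0]
    R = X @ W.T - T                      # residuals y_p(i) - t_p(i)
    S = X @ G.T                          # output change per unit of z
    dE = 2.0 / n_v * np.sum(R * S)       # first derivative at z = 0
    d2E = 2.0 / n_v * np.sum(S * S)      # second derivative at z = 0
    return -dE / d2E

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 4))         # 50 patterns, 4 inputs
T = rng.standard_normal((50, 2))         # 2 outputs per pattern
W = np.zeros((2, 4))

# Search direction: negative gradient of E with respect to W.
G = -(2.0 / X.shape[0]) * (X @ W.T - T).T @ X

z_opt = optimal_learning_factor(W, G, X, T)
error_before = mse(W, X, T)
error_after = mse(W + z_opt * G, X, T)
```

Because the direction is the negative gradient, `z_opt` comes out positive and `error_after` is lower than `error_before`. Nudging `z_opt` in either direction raises the error again.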

Adaptive multiple optimal learning factors for neural network training

Effective Neural Network Training with a New Weighting Mechanism-Based Optimization Algorithm

Optimal Training of Feedforward Neural Networks Using Teaching-Learning-Based Optimization with Modified Learning Phases

A Multilayer Complex Neural Network Training Algorithm and Its Application in Adaptive Equalization

Optimizing connection weights in neural networks using the whale optimization algorithm

Using Fitness Dependent Optimizer for Training Multi-layer Perceptron

Ant Lion Optimizer: Theory, Literature Review, and Application in Multi-layer Perceptron Neural Networks

Adaptive Optimization Algorithms for Machine Learning

Efficient Adaptive Optimization via Subset-Norm and Subspace-Momentum: Fast, Memory-Reduced Training with Convergence Guarantees

Orthogonal Weight Normalization: Solution to Optimization over Multiple Dependent Stiefel Manifolds in Deep Neural Networks

Improving Levenberg-Marquardt Algorithm for Neural Networks

A Neural Network Transformation based Global Optimization Algorithm

Adaptive Levenberg–Marquardt Algorithm: A New Optimization Strategy for Levenberg–Marquardt Neural Networks

Appropriate Learning Rates of Adaptive Learning Rate Optimization Algorithms for Training Deep Neural Networks

Accelerated Gradient-free Neural Network Training by Multi-convex Alternating Optimization

An Adaptive and Momental Bound Method for Stochastic Learning

A Convergent ADMM Framework for Efficient Neural Network Training

An Efficient Optimization Technique for Training Deep Neural Networks

Efficient Second-Order Neural Network Optimization via Adaptive Trust Region Methods

Adaptive Learning Rates with Maximum Variation Averaging

Convergence Rates of Training Deep Neural Networks Via Alternating Minimization Methods