What problem does this paper attempt to address?

The paper asks how to determine the minimum number of neurons required in a fully-connected layer without training the neural network multiple times. Specifically, the author proposes an algorithm that finds the minimum number of neurons in each fully-connected layer of any network architecture, without retraining the network for each candidate number of neurons.

### Problem Background

In deep learning, choosing the number of neurons in a fully-connected layer is an important hyperparameter selection problem. Traditional methods usually search for the optimum by training the network multiple times while varying the number of neurons, which requires a large amount of computing resources and time. Researchers have therefore been looking for more efficient ways to estimate the minimum number of neurons.

### Main Contributions of the Paper

1. **Algorithm Design**: The paper proposes an algorithm based on a truncated singular value decomposition (SVD) auto-encoder, which searches for the minimum number of neurons in inference mode. The algorithm first trains the initial wide network using cross-validation, and then inserts the SVD auto-encoder to search for the minimum number of neurons.
2. **Theoretical Analysis**: The author proves that the minimum number of neurons can be regarded as an internal (latent) property of the network architecture, training data set, layer position, and quality metric, rather than a property related to other hyperparameters. This means that the minimum number of neurons in each hidden fully-connected layer can be estimated independently.
3. **Experimental Verification**: The algorithm has been tested on multiple data sets, covering both classification and regression tasks. The experimental results show that the algorithm estimates the minimum number of neurons stably, and that the resulting network performs comparably to the original network.

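
The inference-mode search described in the contributions can be illustrated with a small sketch. It is not the author's implementation: it assumes that \(A(M)\) in the transformation \(Y' = YA(M)A(M)^+\) holds the first \(M\) right singular vectors of the layer output \(Y\), and the names `svd_autoencoder`, `min_neurons`, and `quality`, the toy data, and the threshold value are all hypothetical.

```python
import numpy as np


def svd_autoencoder(Y, M):
    """Return Y' = Y A(M) A(M)^+ for a truncated SVD with M components.

    Assumption: A(M) holds the first M right singular vectors of Y.
    """
    # Y has shape (samples, neurons); rows of Vt are right singular vectors.
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    A = Vt[:M].T                # direct matrix, shape (neurons, M)
    A_pinv = np.linalg.pinv(A)  # pseudo-inverse, shape (M, neurons)
    return Y @ A @ A_pinv


def min_neurons(Y, quality_fn, threshold):
    """Smallest M whose reconstruction keeps quality_fn(Y') >= threshold.

    No retraining happens here: only the layer output is replaced.
    """
    for M in range(1, Y.shape[1] + 1):
        if quality_fn(svd_autoencoder(Y, M)) >= threshold:
            return M
    return Y.shape[1]  # threshold never reached, keep the full width


# Toy stand-in for a trained network (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
Y = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 16))  # 16 neurons, rank 3
W = rng.normal(size=(16, 1))   # frozen weights of the next layer
target = Y @ W                 # output of the original wide network


def quality(Y_prime):
    # Higher is better: negative mean squared error against the original output.
    return -np.mean((Y_prime @ W - target) ** 2)


print(min_neurons(Y, quality, threshold=-1e-9))
```

In this toy setup the layer output is built to have rank 3, so the search is expected to stop at three components. In a real network, `quality` would be the validation metric of the full model and `threshold` would play the role of \(Q_0\).
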
### Formula Summary

- **Relationship between Matrix Rank and Minimum Number of Neurons**:
  \[ \min(M) \geq \text{rank}(Y) \]
  where \(M\) is the number of neurons in a given layer, \(Y\) is the output matrix of this layer, and \(\text{rank}(Y)\) is the rank of matrix \(Y\).
- **Statistical Equivalence Threshold**:
  \[ Q_0 = \frac{Q@val(S) + \text{Best}(Q)}{2} \]
  where \(Q_0\) is the threshold of statistical equivalence, \(Q@val(S)\) is the quality metric on the validation set, and \(\text{Best}(Q)\) is the best possible quality metric.
- **SVD Auto-encoder Transformation**:
  \[ Y' = YA(M)A(M)^+ \]
  where \(A(M)\) and \(A(M)^+\) are the direct matrix and the pseudo-inverse matrix of the truncated SVD transformation, respectively.

### Conclusion

The algorithm proposed in this paper provides an efficient and stable method to estimate the minimum number of neurons in a fully-connected layer without multiple network trainings. This not only saves computing resources but also provides a new perspective for understanding the internal structure of neural networks.
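
The step from the SVD auto-encoder transformation to the rank relationship in the formula summary can be made explicit with a short derivation. It is a sketch under one assumption that the summary does not state outright: that \(A(M)\) consists of the first \(M\) right singular vectors of \(Y\). The symbols \(U\), \(\Sigma\), \(V\), \(V_M\), \(\sigma_i\), \(u_i\), and \(v_i\) are introduced here for illustration only.

\[ Y = U\Sigma V^{\top}, \qquad A(M) = V_M, \qquad A(M)^+ = V_M^{\top} \quad \text{since } V_M^{\top}V_M = I_M \]

\[ Y' = YA(M)A(M)^+ = YV_MV_M^{\top} = \sum_{i=1}^{M} \sigma_i u_i v_i^{\top} \]

\[ \lVert Y - Y' \rVert_F^2 = \sum_{i>M} \sigma_i^2 \]

Under this assumption the reconstruction is exact, \(Y' = Y\), precisely when every discarded singular value is zero, that is, when \(M \geq \text{rank}(Y)\). This appears to be the sense in which the rank of \(Y\) bounds the minimum number of neurons.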

Minimum number of neurons in fully connected layers of a given neural network (the first approximation)
