Abstract: Working with high-dimensional data is common practice in the field of machine learning. Identifying relevant input features is thus crucial, so as to obtain compact datasets more amenable to effective numerical handling. Further, by isolating the pivotal elements that form the basis of decision making, one can contribute, ex post, to the interpretability of models, which has so far been rather elusive. Here, we propose a novel method to estimate the relative importance of the input components for a Deep Neural Network. This is achieved by leveraging a spectral re-parametrization of the optimization process. Eigenvalues associated with input nodes in fact provide a robust proxy to gauge the relevance of the supplied entry features. Unlike existing techniques, the spectral feature ranking is carried out automatically, as a byproduct of network training. The technique is successfully tested against both synthetic and real data.

What problem does this paper attempt to address?

The paper addresses the problem of automatically identifying the importance of each input feature of a deep neural network (DNN) trained on high-dimensional data, so as to compress the data, improve the interpretability of the model, and provide a more compact dataset for subsequent processing. Specifically:

1. **Feature selection in high-dimensional data**: In machine learning, and especially with deep neural networks, the input data is usually high-dimensional. Screening out the features that are crucial for the model's decisions is therefore an important issue. Traditional feature selection methods are applied either as pre-processing before training or as post-hoc explanation after training, and both approaches have limitations.
2. **Model interpretability**: Identifying which input features are most critical to the model's decisions enhances the interpretability of the model. This matters both for understanding how the model works and for increasing users' trust in it.
3. **Automated feature importance assessment**: Existing feature importance methods usually rely on additional calculation steps or on specific input samples, whereas the method proposed in this paper generates feature importance scores automatically during training, without additional processing steps.

To achieve this goal, the authors propose a method based on spectral parametrization. The importance of each input feature is measured by the eigenvalue associated with the corresponding input node, as obtained from the optimization process. The method not only ranks the input features automatically, but also provides a global view of feature importance that does not depend on specific input samples.

### Method overview

- **Spectral parametrization**: Represent the traditional weight matrix \(W\) as \(A = \Phi\Lambda\Phi^{-1}\) through spectral decomposition, where \(\Lambda\) is the eigenvalue matrix and \(\Phi\) is the eigenvector matrix.
- **Eigenvalues as importance indicators**: The importance of each input feature is measured by the eigenvalue \(\lambda_i\) obtained from the optimization process. Larger eigenvalues indicate that the feature is more important for the model's decisions.
- **Automatic feature ranking**: During training, the eigenvalues are adjusted automatically and eventually form a ranking that reflects the importance of the input features.

### Experimental verification

The authors verified the effectiveness of the method through several experiments, including:

- Independent Gaussian distribution dataset
- Correlated Gaussian distribution dataset
- MNIST dataset

The experimental results show that the method can identify the features that are crucial for classification tasks, on both synthetic and real data.

In conclusion, the paper targets feature selection and model interpretability in high-dimensional data, and proposes a new method based on spectral parametrization that automatically evaluates and ranks the importance of input features during training.
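
The decomposition \(A = \Phi\Lambda\Phi^{-1}\) and the eigenvalue-based ranking described in the method overview can be illustrated with a small NumPy sketch. It is not the authors' implementation. Several details are assumptions made for illustration only: \(A\) is taken to be the adjacency matrix of a single input-to-output layer pair, \(\Phi\) is taken to be the identity plus one lower-left block (`phi_block`), the eigenvalues are fixed stand-in values rather than trained ones, and features are ranked by eigenvalue magnitude. The helper names `spectral_adjacency` and `rank_inputs` are hypothetical.

```python
import numpy as np


def spectral_adjacency(phi_block, lam_in, lam_out):
    """Build A = Phi @ Lambda @ inv(Phi) for one input -> output layer pair.

    Assumption (not stated in the summary): Phi is the identity plus a
    lower-left block that couples input nodes to output nodes.
    """
    n_out, n_in = phi_block.shape
    n = n_in + n_out
    Phi = np.eye(n)
    Phi[n_in:, :n_in] = phi_block            # eigenvector components
    Lam = np.diag(np.concatenate([lam_in, lam_out]))
    return Phi @ Lam @ np.linalg.inv(Phi)


def rank_inputs(lam_in):
    """Order input features from most to least relevant.

    Assumption: relevance is proxied by the eigenvalue magnitude.
    """
    return np.argsort(-np.abs(lam_in))


rng = np.random.default_rng(0)
n_in, n_out = 4, 2
phi_block = rng.normal(size=(n_out, n_in))

# Stand-in values; in the method these would result from training.
lam_in = np.array([0.05, 1.50, -0.80, 0.00])
lam_out = np.array([0.30, -0.20])

A = spectral_adjacency(phi_block, lam_in, lam_out)

# Under the assumed block structure, the lower-left block of A plays the
# role of the usual weight matrix W and has a closed form.
W = A[n_in:, :n_in]
W_closed = (lam_in[None, :] - lam_out[:, None]) * phi_block

ranking = rank_inputs(lam_in)
print("weights match closed form:", np.allclose(W, W_closed))
print("input features, most to least relevant:", ranking.tolist())
```

Under these assumptions each weight factorizes into an eigenvalue difference times an eigenvector entry, so the eigenvalues attached to input nodes are explicit trainable quantities that can be read off after training and sorted, with no separate attribution pass over input samples.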

Automatic Input Feature Relevance via Spectral Neural Networks

Complex Recurrent Spectral Network

Recurrent Spectral Network (RSN): shaping the basin of attraction of a discrete map to reach automated classification

A Set Membership Approach to Discovering Feature Relevance and Explaining Neural Classifier Decisions

Importance estimate of features via analysis of their weight and gradient profile

Spectral Neural Networks: Approximation Theory and Optimization Landscape

How good Neural Networks interpretation methods really are? A quantitative benchmark

Lyapunov-Guided Representation of Recurrent Neural Network Performance

Retrieving genuine nonlinear Raman responses in ultrafast spectroscopy via deep learning

FsNet: Feature Selection Network on High-dimensional Biological Data

Enhancing the classification metrics of spectroscopy spectrums using neural network based low dimensional space

AFS: An Attention-based mechanism for Supervised Feature Selection

Learning Neural Eigenfunctions for Unsupervised Semantic Segmentation

Learning active subspaces and discovering important features with Gaussian radial basis functions neural networks

Interpreting Deep Neural Networks Through Variable Importance

Spectral methods for Neural Integral Equations

Prior Knowledge Neural Network for Automatic Feature Construction in Financial Time Series

Spectral Self-supervised Feature Selection

Neural network interpretability with layer-wise relevance propagation: novel techniques for neuron selection and visualization

AutoField: Automating Feature Selection in Deep Recommender Systems

A Spectral Theory of Neural Prediction and Alignment