Abstract: Random feature methods have been successful in various machine learning tasks, are easy to compute, and come with theoretical accuracy bounds. They serve as an alternative approach to standard neural networks since they can represent similar function spaces without a costly training phase. However, for accuracy, random feature methods require more measurements than trainable parameters, limiting their use for data-scarce applications or problems in scientific machine learning. This paper introduces the sparse random feature expansion to obtain parsimonious random feature models. Specifically, we leverage ideas from compressive sensing to generate random feature expansions with theoretical guarantees even in the data-scarce setting. In particular, we provide generalization bounds for functions in a certain class (that is dense in a reproducing kernel Hilbert space) depending on the number of samples and the distribution of features. The generalization bounds improve with additional structural conditions, such as coordinate sparsity, compact clusters of the spectrum, or rapid spectral decay. In particular, by introducing sparse features, i.e. features with random sparse weights, we provide improved bounds for low order functions. We show that the sparse random feature expansions outperform shallow networks in several scientific machine learning tasks.

What problem does this paper attempt to address?

The paper addresses how to improve the generalization of random feature methods when data are scarce, especially when approximating high-dimensional, low-order functions. It introduces the Sparse Random Feature Expansion (SRFE), which uses ideas from compressive sensing to generate random feature expansions that come with theoretical guarantees even in the data-scarce setting. The paper provides generalization bounds for a function class that is dense in a reproducing kernel Hilbert space; these bounds depend on the number of samples and on the distribution of the features. In addition, by introducing sparse features (i.e., features with random sparse weights), the paper obtains improved bounds for low-order functions and shows that sparse random feature expansions outperform shallow networks in several scientific machine learning tasks.

### Background and Motivation of the Paper

1. **Advantages and Limitations of the Random Feature Method**
   - Random feature methods have been successful in various machine learning tasks. They are easy to compute and come with theoretical accuracy bounds.
   - Compared with standard neural networks, random feature methods can represent similar function spaces without a costly training phase.
   - However, to be accurate, random feature methods require more measurements than trainable parameters, which limits their use in data-scarce applications and in scientific machine learning problems.
2. **Proposal of Sparse Random Feature Expansion (SRFE)**
   - To overcome these limitations, the paper proposes the Sparse Random Feature Expansion (SRFE) to obtain parsimonious random feature models.
   - SRFE uses ideas from compressive sensing to generate random feature expansions with theoretical guarantees, and it remains effective when data are scarce.
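The idea described above (random features with sparse weights, followed by a sparsity-promoting fit) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: the cosine features with random phase, the Gaussian weight distribution, the test function, and all names (`sparse_random_weights`, `feature_matrix`, `ista_lasso`, `lasso_objective`) are hypothetical choices made here. The fit uses an l1-penalized least-squares problem solved by iterative soft thresholding (ISTA), a common stand-in for compressive-sensing recovery; the summary does not specify which solver the paper uses.

```python
import numpy as np


def sparse_random_weights(n_features, dim, q, rng):
    """Draw n_features weight vectors in R^dim, each with exactly q nonzero entries.

    Assumption: the nonzero entries are standard Gaussian; the summary only says
    the weights are random and sparse.
    """
    W = np.zeros((n_features, dim))
    for j in range(n_features):
        support = rng.choice(dim, size=q, replace=False)  # random coordinate subset
        W[j, support] = rng.standard_normal(q)
    return W


def feature_matrix(X, W, b):
    """Random feature matrix with entries A[k, j] = cos(<x_k, w_j> + b_j)."""
    return np.cos(X @ W.T + b)


def lasso_objective(A, y, c, lam):
    """Value of 0.5 * ||A c - y||_2^2 + lam * ||c||_1."""
    return 0.5 * np.sum((A @ c - y) ** 2) + lam * np.sum(np.abs(c))


def ista_lasso(A, y, lam, n_iter=2000):
    """Minimize the lasso objective by iterative soft thresholding (ISTA)."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    c = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = c - step * (A.T @ (A @ c - y))                        # gradient step
        c = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return c


# Data-scarce setting: fewer samples (m) than random features (N).
rng = np.random.default_rng(0)
d, m, N, q = 10, 100, 400, 2

X = rng.uniform(-1.0, 1.0, size=(m, d))
# Hypothetical low-order target: each term depends on at most q = 2 coordinates.
y = np.sin(X[:, 0]) + X[:, 1] * X[:, 2]

W = sparse_random_weights(N, d, q, rng)
b = rng.uniform(0.0, 2.0 * np.pi, size=N)
A = feature_matrix(X, W, b)

lam = 0.1 * np.max(np.abs(A.T @ y))  # penalty chosen relative to the data scale
c = ista_lasso(A, y, lam)

print("nonzero coefficients:", np.count_nonzero(c), "of", N)
```

Each row of `W` touches only `q` coordinates, so every feature is itself a low-order function of the input, which is the structural assumption behind the improved bounds for low-order targets. The penalty `lam` is a tuning parameter; the value used here is only a convenient default for the sketch.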
### Main Contributions

1. **Proposal of the Sparse Feature Model**
   - A new sparse feature model (SRFE) is proposed. It improves on compressive sensing and polynomial chaos expansion (PCE) approaches by using the Random Fourier Feature (RFF) method.
   - SRFE outperforms standard shallow neural networks when data are limited.
2. **Theoretical Analysis**
   - Bounds on the sample complexity and the feature complexity are provided; these bounds control the error between the SRFE approximation and the target function.
   - For high-dimensional, low-order functions, SRFE is proved to achieve a generalization bound of \(O(N^{-1/2})\), where the constant depends polynomially, rather than exponentially, on the dimension, thus overcoming the curse of dimensionality.
3. **Selection of Sparse Features**
   - By introducing sparse feature weights, SRFE performs well in approximating low-order functions and helps to alleviate the difficulty of approximating high-dimensional functions.

### Mathematical Formulas

- **Generalization Bound**

\[
\sqrt{\int_{\mathbb{R}^d} |f(x) - f^\sharp(x)|^2 \, d\mu} \leq C' \left(1 + \frac{N^{1/2}}{s^{1/2}} m^{-1/4} \log^{1/4}\left(\frac{1}{\delta}\right)\right) \kappa_{s,1}(c^\star) + C \left(1 + \frac{N^{1/2}}{m^{1/4}} \log^{1/4}\left(\frac{1}{\delta}\right)\right) \sqrt{\epsilon^2 \|f\|^2_\rho + 4\nu^2}
\]

- **Definition of Sparse Feature Weights**

\[ \tilde{c}^\star_j := \frac{1}{K} \sum_{\ell = 1}^K \tilde{c}^\star_{\ell,j}, \quad \text{where} \quad \tilde{c}^\star_{\ell,j} = \begin{cases} \frac{\alpha_\ell(\omega_j)}{n \rho(\omega_

Generalization Bounds for Sparse Random Feature Expansions

Sparsity-aware generalization theory for deep neural networks

A General Framework for Enhancing Sparsity of Generalized Polynomial Chaos Expansions

Universal approximation property of Banach space-valued random feature models including random neural networks

On Data-Dependent Random Features for Improved Generalization in Supervised Learning

A Random Matrix Theory Perspective on the Spectrum of Learned Features and Asymptotic Generalization Capabilities

Generalization and Estimation Error Bounds for Model-based Neural Networks

Enhanced Expressive Power and Fast Training of Neural Networks by Random Projections

Feature Variance Regularization: A Simple Way to Improve the Generalizability of Neural Networks

Sparse random feature maps for the item-multiset kernel

Estimating the Generalization in Deep Neural Networks via Sparsity

SeerNet: Predicting Convolutional Neural Network Feature-Map Sparsity Through Low-Bit Quantization.

Understanding Generalization in Deep Learning via Tensor Methods

SHRIMP: Sparser Random Feature Models via Iterative Magnitude Pruning

A Duality Framework for Generalization Analysis of Random Feature Models and Two-Layer Neural Networks.

High-dimensional Model Recovery from Random Sketched Data by Exploring Intrinsic Sparsity

Invariant-Feature Subspace Recovery: A New Class of Provable Domain Generalization Algorithms

Bayes-optimal Learning of Deep Random Networks of Extensive-width

The generalization error of random features regression: Precise asymptotics and double descent curve

Do Compressed Representations Generalize Better?

Robust Sparse Recovery with Sparse Bernoulli matrices via Expanders