Optimal sampling for least squares approximation with general dictionaries

Philipp Trunschke, Anthony Nouy

2024-10-08

Abstract: We consider the problem of approximating an unknown function in a nonlinear model class from point evaluations. When obtaining these point evaluations is costly, minimising the required sample size becomes crucial. Recently, there has been an increasing focus on employing adaptive sampling strategies to achieve this. These strategies are based on linear spaces related to the nonlinear model class, for which the optimal sampling measures are known. However, the resulting optimal sampling measures depend on an orthonormal basis of the linear space, which is rarely known. Consequently, sampling from these measures is challenging in practice. This manuscript presents a sampling strategy that iteratively refines an estimate of the optimal sampling measure by updating it based on previously drawn samples. This strategy can be performed offline and does not require evaluations of the sought function. We establish convergence and illustrate the practical performance through numerical experiments. Comparing the presented approach with standard Monte Carlo sampling demonstrates a significant reduction in the number of samples required to achieve a good estimation of an orthonormal basis.

Numerical Analysis

What problem does this paper attempt to address?

This paper addresses how to approximate an unknown function from point evaluations within a given nonlinear model class. When each evaluation is costly, keeping the sample size small is crucial, and adaptive sampling strategies have attracted growing attention for this purpose. These strategies rely on linear spaces related to the nonlinear model class, for which the optimal sampling measures are known. The optimal sampling measure, however, depends on an orthonormal basis of the linear space, which is rarely known in practice, so sampling from it is difficult. The paper proposes a sampling strategy that iteratively refines an estimate of the optimal sampling measure, updating it from previously drawn samples. The strategy can be run offline and requires no evaluations of the sought function. The authors establish convergence of the method and illustrate its practical performance in numerical experiments. Compared with standard Monte Carlo sampling, it significantly reduces the number of samples needed to obtain a good estimate of an orthonormal basis. Specifically, the paper considers the general setting of an overcomplete dictionary of arbitrary functions on a general domain, proposes a simple algorithm, and verifies its effectiveness numerically. The main contribution is to use the sampling density \( w^{-1}_{\hat{G}^{(0)}} \rho \) induced by an initial estimate \( \hat{G}^{(0)} \) to improve \( \hat{G}^{(0)} \) itself, so that repeating this step gradually improves the accuracy of the estimate.
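The refinement idea can be illustrated with a minimal Python sketch. It is not the authors' algorithm, and everything specific in it is an assumption made for illustration: \( \hat{G} \) is read as an estimate of the dictionary's Gram matrix with respect to \( \rho \), \( \rho \) is taken to be the uniform measure on [-1, 1], the dictionary (`dictionary`) is a hypothetical redundant set of monomials, and a finite candidate pool drawn from \( \rho \) stands in for exact sampling from \( w^{-1} \rho \). Each pass also replaces the estimate instead of accumulating samples, which may differ from the paper's update rule.

```python
import numpy as np


def dictionary(x):
    """Hypothetical overcomplete dictionary on [-1, 1] (illustrative choice).

    Four monomials plus the redundant element 1 + x, so the span has
    dimension 4 although there are 5 functions.
    """
    x = np.asarray(x, dtype=float)
    return np.stack([np.ones_like(x), x, x**2, x**3, 1.0 + x], axis=-1)


def inverse_weight(G, B, rtol=1e-10):
    """Evaluate w^{-1}(x) = b(x)^T G^+ b(x) / rank(G) at the rows of B.

    The eigenvectors of G with non-negligible eigenvalues define an
    orthonormal basis with respect to the inner product encoded by G.
    """
    evals, evecs = np.linalg.eigh(G)
    keep = evals > rtol * evals.max()
    # Coordinates of b(x) in the estimated orthonormal basis.
    coeffs = B @ evecs[:, keep] / np.sqrt(evals[keep])
    return np.sum(coeffs**2, axis=1) / keep.sum()


def refine_gram(n_init=20, n_per_iter=200, n_iter=5, pool_size=20000, seed=0):
    """Iteratively refine a Gram matrix estimate (sketch, see assumptions)."""
    rng = np.random.default_rng(seed)

    # Assumption: a large pool of draws from rho approximates the domain.
    pool = rng.uniform(-1.0, 1.0, size=pool_size)
    B_pool = dictionary(pool)

    # Initial estimate from plain Monte Carlo samples of rho.
    B0 = dictionary(rng.uniform(-1.0, 1.0, size=n_init))
    G = B0.T @ B0 / n_init

    for _ in range(n_iter):
        # Sampling density induced by the current estimate, on the pool.
        winv = inverse_weight(G, B_pool)
        idx = rng.choice(pool_size, size=n_per_iter, p=winv / winv.sum())

        # Importance weights undo the change of measure; the factor
        # winv.mean() normalises the density over the pool.
        weights = winv.mean() / winv[idx]
        B = B_pool[idx]
        G = (B * weights[:, None]).T @ B / n_per_iter

    return G
```

Calling `refine_gram()` returns a 5 x 5 symmetric matrix of rank 4 for this dictionary. No evaluation of a target function appears anywhere in the loop, which mirrors the offline character described above. In a realistic setting the pool-based sampling step would have to be replaced by a sampler suited to the domain at hand.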

Randomized least-squares with minimal oversampling and interpolation in general spaces

Optimal sampling for stochastic and natural gradient descent

Least Squares Approximations in Linear Statistical Inverse Learning Problems

Almost-sure quasi-optimal approximation in reproducing kernel Hilbert spaces

Sequential Sampling for Optimal Weighted Least Squares Approximations in Hierarchical Spaces

Learning Sparsely Used Overcomplete Dictionaries via Alternating Minimization

Convergence of alternating minimisation algorithms for dictionary learning

Error Guarantees for Least Squares Approximation with Noisy Samples in Domain Adaptation

Optimal Subsampling Approaches for Large Sample Linear Regression

The Sample Complexity of Dictionary Learning

Adaptive Approximation by Optimal Weighted Least-Squares Methods

Subsampled Optimization: Statistical Guarantees, Mean Squared Error Approximation, and Sampling Method

Complete Dictionary Learning via $\ell^4$-Norm Maximization over the Orthogonal Group

Complete Dictionary Learning Via L4-Norm Maximization over the Orthogonal Group

Boosted optimal weighted least-squares

Scalable Subspace Methods for Derivative-Free Nonlinear Least-Squares Optimization

Randomized sketching of nonlinear eigenvalue problems

Derivative-Free Optimization via Adaptive Sampling Strategies

Global Optimization Algorithm through High-Resolution Sampling

Gradient-based Sampling: An Adaptive Importance Sampling for Least-squares