Abstract: Kernels are efficient in representing nonlocal dependence and they are widely used to design operators between function spaces. Thus, learning kernels in operators from data is an inverse problem of general interest. Due to the nonlocal dependence, the inverse problem can be severely ill-posed with a data-dependent singular inversion operator. The Bayesian approach overcomes the ill-posedness through a non-degenerate prior. However, a fixed non-degenerate prior leads to a divergent posterior mean when the observation noise becomes small, if the data induces a perturbation in the eigenspace of zero eigenvalues of the inversion operator. We introduce a data-adaptive prior to achieve a stable posterior whose mean always has a small noise limit. The data-adaptive prior's covariance is the inversion operator with a hyper-parameter selected adaptive to data by the L-curve method. Furthermore, we provide a detailed analysis on the computational practice of the data-adaptive prior, and demonstrate it on Toeplitz matrices and integral operators. Numerical tests show that a fixed prior can lead to a divergent posterior mean in the presence of any of the four types of errors: discretization error, model error, partial observation and wrong noise assumption. In contrast, the data-adaptive prior always attains posterior means with small noise limits.

What problem does this paper attempt to address?

The paper addresses the ill-posed inverse problem that arises when learning kernels in operators from data. Specifically:

1. **Ill-posedness Caused by Nonlocal Dependence**: Kernels represent nonlocal dependence, so learning a kernel from data is an inverse problem that can be severely ill-posed. The ill-posedness shows up as a data-dependent inversion operator that is singular.
2. **Limitations of Traditional Bayesian Methods**: Traditional Bayesian methods handle ill-posedness with a non-degenerate prior. With a fixed non-degenerate prior, however, the posterior mean can diverge as the observation noise becomes small, namely when the data induces a perturbation in the null space (the eigenspace of zero eigenvalues) of the inversion operator.
3. **Proposal of Data-Adaptive Priors**: To overcome this, the paper proposes a data-adaptive Reproducing Kernel Hilbert Space (RKHS) prior. Its posterior mean always has a small noise limit, and in numerical experiments it remains stable where a fixed non-degenerate prior diverges.

### Specific Problem Description

- **Problem Background**:
  - Kernels represent nonlocal dependence efficiently and are widely used to design operators between function spaces.
  - Learning the kernel in an operator is a linear inverse problem, but because of nonlocal dependence and various perturbations (such as data noise, numerical errors or model errors), the problem is usually severely ill-posed.
- **Deficiencies of Traditional Methods**:
  - Traditional Bayesian methods use non-degenerate priors to handle ill-posedness, but a fixed non-degenerate prior can yield a divergent posterior mean in the small noise limit.
  - The divergence occurs when the data perturbs the null space of the inversion operator. The numerical tests trigger it with any of four types of errors: discretization error, model error, partial observation and wrong noise assumption.
- **Solution Proposed in the Paper**:
  - A data-adaptive RKHS prior whose covariance is the inversion operator, scaled by a hyper-parameter selected adaptively from the data by the L-curve method. It keeps the posterior mean stable in the small noise limit.
  - The prior is validated through analysis and numerical experiments on learning discrete kernels (Toeplitz matrices) and continuous kernels (integral operators).

### Mathematical Representation

- **Loss Function**:
  \[ E(\phi)=\frac{1}{N\sigma_{\eta}^{2}}\sum_{k=1}^{N}\|R_{\phi}[u_k]-f_k\|_{Y}^{2}=\frac{1}{2\sigma_{\eta}^{2}}\left[\langle L_G\phi,\phi\rangle_{L^2_{\rho}}-2\langle\phi_D,\phi\rangle_{L^2_{\rho}}+C_f\right] \]
- **Posterior Mean**:
  - Posterior mean using the fixed non-degenerate prior \(\mathcal{N}(0,Q_0)\):
    \[ \mu_1=(L_G+\sigma_{\eta}^{2}Q_0^{-1})^{-1}\phi_D \]
  - Posterior mean using the data-adaptive RKHS prior \(\mathcal{N}(0,\lambda_*^{-1}L_G)\):
    \[ \mu_{D1}=(L_G^2+\sigma_{\eta}^{2}\lambda_*I_{\text{Null}(L_G)^{\perp}})^{-1}L_G\phi_D \]

### Conclusion

By introducing the data-adaptive RKHS prior, the paper obtains a posterior mean that remains stable as the observation noise becomes small, and it verifies this behavior through numerical experiments on learning discrete and continuous kernels. The method offers a new approach to ill-posed inverse problems.
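The contrast between the two posterior means in the Mathematical Representation section can be illustrated on a small finite-dimensional example. The sketch below is not the paper's code: the 3×3 diagonal `L_G`, the perturbation size `eps`, the choice \(Q_0 = I\) for the fixed prior, and the fixed hyper-parameter `lam` (the paper selects \(\lambda_*\) by the L-curve method) are all assumptions made for illustration, as are the function names.

```python
import numpy as np

def fixed_prior_mean(L_G, phi_D, sigma):
    """Posterior mean under a fixed prior N(0, I), i.e. assuming Q_0 = I."""
    n = L_G.shape[0]
    return np.linalg.solve(L_G + sigma**2 * np.eye(n), phi_D)

def adaptive_prior_mean(L_G, phi_D, sigma, lam):
    """Posterior mean under the data-adaptive prior N(0, lam^{-1} L_G).

    The full identity is used instead of the identity on Null(L_G)^perp;
    the result is the same because L_G @ phi_D has no null-space component.
    """
    n = L_G.shape[0]
    return np.linalg.solve(L_G @ L_G + sigma**2 * lam * np.eye(n), L_G @ phi_D)

# Hypothetical toy problem: a singular inversion operator with one zero eigenvalue.
L_G = np.diag([1.0, 0.5, 0.0])
phi_true = np.array([1.0, 2.0, 0.0])      # lies in Null(L_G)^perp
eps = 1e-3                                 # size of the null-space perturbation
phi_D = L_G @ phi_true + eps * np.array([0.0, 0.0, 1.0])
lam = 1.0                                  # held fixed here, not chosen by L-curve

for sigma in (1e-1, 1e-2, 1e-3):
    mu_fixed = fixed_prior_mean(L_G, phi_D, sigma)
    mu_adapt = adaptive_prior_mean(L_G, phi_D, sigma, lam)
    print(
        f"sigma={sigma:.0e}  "
        f"|mu_fixed|={np.linalg.norm(mu_fixed):.3e}  "
        f"|mu_adapt - phi_true|={np.linalg.norm(mu_adapt - phi_true):.3e}"
    )
```

In this toy setting the null-space component of the fixed-prior mean equals `eps / sigma**2`, so it grows without bound as `sigma` shrinks, whereas the data-adaptive mean applies `L_G` to `phi_D` first, which removes the perturbation and lets the mean converge to `phi_true`.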

A Data-Adaptive Prior for Bayesian Learning of Kernels in Operators

An Efficient Variational Bayesian Inference Approach Via Studient's-t Priors for Acoustic Imaging in Colored Noises

Bayesian Inference and Deep Learning for Inverse Problems

Adaptive operator learning for infinite-dimensional Bayesian inverse problems

Posterior Contraction for Empirical Bayesian Approach to Inverse Problems under Non-Diagonal Assumption

A posterior contraction for Bayesian inverse problems in Banach spaces

Learning Data-adaptive Nonparametric Kernels

Bayesian Posterior Contraction Rates for Linear Severely Ill-posed Inverse Problems

Residual-based error correction for neural operator accelerated infinite-dimensional Bayesian inverse problems

Linear methods for non-linear inverse problems

Taming Score-Based Diffusion Priors for Infinite-Dimensional Nonlinear Inverse Problems

Adaptive Bayesian Regression on Data with Low Intrinsic Dimensionality

What's in a Prior? Learned Proximal Networks for Inverse Problems

Ideal Bayesian Spatial Adaptation

Towards a variational principle for motivated vehicle motion

Bayesian Approach to Inverse Problems for Functions with A Variable-Index Besov Prior

Nonparametric learning of kernels in nonlocal operators

A Data-Driven Bayesian Nonparametric Approach for Black-Box Optimization

A data-driven adaptive regularization method and its applications

Minimax Optimal Kernel Operator Learning via Multilevel Training

Bayesian inference for transductive learning of kernel matrix using the Tanner-Wong data augmentation algorithm