Robust estimation of a regression function in exponential families
Yannick Baraud, Juntong Chen
DOI: https://doi.org/10.1016/j.jspi.2024.106167
IF: 1.095
2024-03-26
Journal of Statistical Planning and Inference
Abstract: We observe $n$ pairs of independent (but not necessarily i.i.d.) random variables $X_1=(W_1,Y_1),\ldots,X_n=(W_n,Y_n)$ and tackle the problem of estimating the conditional distributions $Q_i^{\star}(w_i)$ of $Y_i$ given $W_i=w_i$ for all $i\in\{1,\ldots,n\}$. Even though these assumptions might not be true, we base our estimator on the assumptions that the data are i.i.d. and that the conditional distributions of $Y_i$ given $W_i=w_i$ belong to a one-parameter exponential family $\overline{Q}$ with parameter space given by an interval $I$. More precisely, we pretend that these conditional distributions take the form $Q_{\theta(w_i)}\in\overline{Q}$ for some $\theta$ that belongs to a VC-class $\overline{\Theta}$ of functions with values in $I$. For each $i\in\{1,\ldots,n\}$, we estimate $Q_i^{\star}(w_i)$ by a distribution of the same form, i.e. $Q_{\hat{\theta}(w_i)}\in\overline{Q}$, where $\hat{\theta}=\hat{\theta}(X_1,\ldots,X_n)$ is a well-chosen estimator with values in $\overline{\Theta}$. We establish non-asymptotic exponential inequalities for the upper deviations of a Hellinger-type distance between the true conditional distributions of the data and the estimated ones based on the exponential family $\overline{Q}$ and the class of functions $\overline{\Theta}$ we chose. We show that our estimation strategy is robust to model misspecification, contamination and the presence of outliers. Besides, when the data are truly i.i.d., the exponential family $\overline{Q}$ is suitably parametrized and the conditional distributions $Q_i^{\star}(w_i)$ are of the form $Q_{\theta^{\star}(w_i)}\in\overline{Q}$ for some unknown Hölderian function $\theta^{\star}$ with values in $I$, we prove that the estimator $\hat{\theta}$ of $\theta^{\star}$ is minimax (up to a logarithmic factor). Finally, we provide an algorithm for calculating $\hat{\theta}$ when $\overline{\Theta}$ is a VC-class of functions of low or moderate dimension and we carry out a simulation study to compare its performance to that of the MLE and median-based estimators. The proof of our main result relies on an upper bound, with explicit numerical constants, on the expectation of the supremum of an empirical process over a VC-subgraph class. This bound can be of independent interest.
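To make the Hellinger-type loss mentioned in the abstract concrete, here is a minimal Python sketch for one particular exponential family, the Poisson family. It is an illustration under stated assumptions, not the authors' estimator or their exact loss: the abstract does not give the precise definition of the distance, so the sketch assumes the normalization h^2(P, Q) = 1 - sum_k sqrt(p_k q_k), the natural parametrization lambda = exp(theta), and an average over the design points w_i. The function names are hypothetical.

```python
import math


def hellinger2_poisson(lam_a, lam_b):
    """Squared Hellinger distance between Poisson(lam_a) and Poisson(lam_b).

    Assumed normalization: h^2 = 1 - sum_k sqrt(p_k * q_k), which for two
    Poisson laws has the closed form 1 - exp(-(sqrt(a) - sqrt(b))^2 / 2).
    """
    return 1.0 - math.exp(-0.5 * (math.sqrt(lam_a) - math.sqrt(lam_b)) ** 2)


def hellinger2_poisson_numeric(lam_a, lam_b, kmax=200):
    """Same quantity, obtained by summing the Hellinger affinity directly."""
    affinity = 0.0
    for k in range(kmax + 1):
        log_p = -lam_a + k * math.log(lam_a) - math.lgamma(k + 1)
        log_q = -lam_b + k * math.log(lam_b) - math.lgamma(k + 1)
        affinity += math.exp(0.5 * (log_p + log_q))
    return 1.0 - affinity


def average_hellinger2(theta_true, theta_hat, design):
    """Average over design points w of h^2 between Poisson(exp(theta_true(w)))
    and Poisson(exp(theta_hat(w))).

    Assumption: natural parametrization lam = exp(theta); the paper may use
    a different parametrization and a different weighting of the points.
    """
    total = 0.0
    for w in design:
        total += hellinger2_poisson(math.exp(theta_true(w)),
                                    math.exp(theta_hat(w)))
    return total / len(design)
```

For instance, `average_hellinger2(lambda w: 1.0 + w, lambda w: 1.1 + 0.9 * w, [0.0, 0.5, 1.0])` evaluates the loss of a hypothetical linear fit against a hypothetical true regression function on three design points. Because the squared Hellinger distance is bounded by 1, a single badly fitted point can contribute at most 1/n to this average, which is one intuition for why Hellinger-type losses are natural in robustness statements.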
statistics & probability