Alexander R. Luedtke, Mark J. van der Laan
Abstract: Suppose one has a collection of parameters indexed by a (possibly infinite dimensional) set. Given data generated from some distribution, the objective is to estimate the maximal parameter in this collection evaluated at this distribution. This estimation problem is typically non-regular when the maximizing parameter is non-unique, and as a result standard asymptotic techniques generally fail in this case. We present a technique for developing parametric-rate confidence intervals for the quantity of interest in these non-regular settings. We show that our estimator is asymptotically efficient when the maximizing parameter is unique so that regular estimation is possible. We apply our technique to a recent example from the literature in which one wishes to report the maximal absolute correlation between a prespecified outcome and one of p predictors. The simplicity of our technique enables an analysis of the previously open case where p grows with sample size. Specifically, we only require that log(p) grows slower than the square root of n, where n is the sample size. We show that, unlike earlier approaches, our method scales to massive data sets: the point estimate and confidence intervals can be constructed in O(np) time.
What problem does this paper attempt to address?
This paper addresses how to construct parametric-rate confidence intervals in non-regular estimation problems. Specifically, when the maximizing parameter is not unique, standard asymptotic techniques generally fail. The authors propose a technique for constructing parametric-rate confidence intervals in these non-regular settings, and they show that their estimator is asymptotically efficient when the maximizing parameter is unique, which is the case in which regular estimation is possible.
### Background and Problem Description of the Paper
The paper "Parametric-Rate Inference for One-Sided Differentiable Parameters" was written by Alexander R. Luedtke and Mark J. van der Laan and published in 2018. It focuses on constructing parametric-rate confidence intervals in non-regular estimation problems. Specifically, suppose there is a collection of parameters indexed by a set, which may be infinite-dimensional. Given data generated from some distribution, the goal is to estimate the maximal parameter in this collection evaluated at that distribution. When the maximizing parameter is not unique, this problem is typically non-regular, and standard asymptotic techniques generally fail.
### Main Contributions
1. **Technique Development**: The authors propose a technique for constructing parametric-rate confidence intervals in non-regular settings.
2. **Asymptotic Efficiency**: The authors show that their estimator is asymptotically efficient when the maximizing parameter is unique, so that regular estimation is possible.
3. **Application Example**: The authors apply their technique to a recent example from the literature: reporting the maximal absolute correlation between a prespecified outcome and one of \(p\) predictors.
4. **Computational Efficiency**: The analysis covers the previously open case where \(p\) grows with the sample size \(n\), requiring only that \(\log(p)\) grows slower than \(\sqrt{n}\). The point estimate and confidence intervals can be constructed in \(O(np)\) time, so the method scales to massive data sets.
### Key Concepts
- **Pathwise Differentiability**: Each parameter \(\Psi_d\) is pathwise differentiable over all distributions in the model.
- **Stabilized One-Step Estimator**: An estimator for non-regular estimation problems that uses sample splitting and iterative conditioning so that the full sample still contributes to the final estimate.
- **Non-regular Inference**: When the maximizing parameter is not unique, the parameter \(P\mapsto\max_{d\in D_n}\Psi_d(P)\) is generally not smooth enough to admit the standard first-order expansion, which is why standard asymptotic techniques fail.
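
The non-regularity described above can be seen in a toy simulation. The sketch below is not taken from the paper: it assumes two independent Gaussian samples with unit variance and uses the naive plug-in estimator (the larger of the two sample means). The function name `scaled_error_of_max` and all settings are hypothetical choices made for illustration. When the two true means are tied, the root-n scaled error behaves like the maximum of two independent standard normals, whose mean is \(1/\sqrt{\pi}\approx 0.56\), so it is not centered at zero. When the maximizer is unique and well separated, the scaled error is approximately centered.

```python
import numpy as np

def scaled_error_of_max(mu, n=400, reps=2000, seed=0):
    """Simulate sqrt(n) * (max of sample means - max of true means).

    Toy setting (an assumption, not the paper's): each of the len(mu)
    parameters is the mean of an independent N(mu_j, 1) sample of size n.
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    # data has shape (reps, n, number of parameters)
    data = rng.normal(loc=mu, scale=1.0, size=(reps, n, mu.size))
    sample_means = data.mean(axis=1)
    return np.sqrt(n) * (sample_means.max(axis=1) - mu.max())

tied = scaled_error_of_max([0.0, 0.0])    # maximizer is not unique
unique = scaled_error_of_max([0.0, 1.0])  # maximizer is unique

print("mean scaled error, tied maximizer:  ", tied.mean())
print("mean scaled error, unique maximizer:", unique.mean())
```

A normal-based confidence interval centered at the plug-in estimate would therefore be off-center in the tied case, which is the kind of failure the paper's technique is designed to avoid.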
### Application Instance
The authors apply their technique to a specific example: reporting the maximal absolute correlation between a prespecified outcome and one of \(p\) predictors. They show that their method remains valid when \(p\) grows with the sample size \(n\), provided that \(\log(p)\) grows slower than \(\sqrt{n}\), and that the point estimate and confidence intervals can be constructed in \(O(np)\) time. This makes the method suitable for massive data sets.
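
To make the \(O(np)\) cost concrete, here is a minimal sketch of the naive plug-in quantity in this example: the largest absolute sample correlation between an outcome vector and the columns of a predictor matrix. This is not the authors' stabilized one-step estimator and it produces no confidence interval. It only shows that a single pass over an \(n \times p\) matrix is enough to compute all \(p\) correlations. The function name `max_abs_correlation` is hypothetical, and the sketch assumes that no predictor column and no outcome vector is constant.

```python
import numpy as np

def max_abs_correlation(X, y):
    """Return (column index, value) of the largest absolute sample
    correlation between y and the columns of X.

    Every step touches each entry of the n x p matrix a constant number
    of times, so the total cost is O(np).
    Assumes no column of X (and not y) is constant.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)   # center each predictor
    yc = y - y.mean()         # center the outcome
    cross = Xc.T @ yc         # p cross-products
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    corr = cross / denom      # p sample correlations
    k = int(np.argmax(np.abs(corr)))
    return k, float(abs(corr[k]))

# Usage on simulated data: only column 3 is related to the outcome.
rng = np.random.default_rng(0)
n, p = 500, 50
X = rng.normal(size=(n, p))
y = 0.5 * X[:, 3] + rng.normal(size=n)
k, r = max_abs_correlation(X, y)
print("selected predictor:", k, "absolute correlation:", round(r, 3))
```

Reporting this plug-in maximum with a standard normal-based interval is the kind of approach that can fail when several predictors are tied for the largest correlation, which is the setting the paper targets.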
### Conclusion
This paper provides a technique for constructing parametric-rate confidence intervals in non-regular estimation problems and shows that the resulting estimator is asymptotically efficient when the maximizing parameter is unique. In addition, the method is computationally efficient: in the maximal correlation example, the point estimate and confidence intervals can be constructed in \(O(np)\) time, which makes it suitable for massive data sets. This gives practitioners a new tool for non-regular estimation problems with high-dimensional data.