Provable More Data Hurt in High Dimensional Least Squares Estimator

Li Zeng, Chuanlong Xie, Qinwen Wang
DOI: https://doi.org/10.48550/arxiv.2008.06296
2020-01-01
Abstract: This paper investigates the finite-sample prediction risk of the high-dimensional least squares estimator. We derive the central limit theorem for the prediction risk when both the sample size and the number of features tend to infinity. Furthermore, the finite-sample distribution and the confidence interval of the prediction risk are provided. Our theoretical results demonstrate the sample-wise nonmonotonicity of the prediction risk and confirm the "more data hurt" phenomenon.
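
The sample-wise nonmonotonicity described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's experiment. It assumes isotropic Gaussian features, a unit-norm coefficient vector, Gaussian noise, and the minimum-norm least squares solution when the sample size n is below the feature count p. The function name `simulate_risk` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_risk(n, p=50, sigma=1.0, trials=200, seed=0):
    """Monte Carlo estimate of the excess prediction risk with n samples.

    Assumptions (not taken from the paper): features are i.i.d. N(0, 1),
    the true coefficient vector has unit norm, and noise is N(0, sigma^2).
    """
    rng = np.random.default_rng(seed)
    beta = np.ones(p) / np.sqrt(p)  # assumed true coefficients, unit norm
    errors = []
    for _ in range(trials):
        X = rng.standard_normal((n, p))
        y = X @ beta + sigma * rng.standard_normal(n)
        # The pseudoinverse gives ordinary least squares when n >= p and
        # the minimum-norm interpolating solution when n < p.
        beta_hat = np.linalg.pinv(X) @ y
        # For a new x ~ N(0, I), the excess prediction risk
        # E[(x^T (beta_hat - beta))^2] equals ||beta_hat - beta||^2.
        errors.append(np.sum((beta_hat - beta) ** 2))
    return float(np.mean(errors))


if __name__ == "__main__":
    for n in (25, 45, 100, 200):
        print(n, round(simulate_risk(n), 3))
```

Under these assumptions, the estimated risk with n = 45 samples tends to be larger than with n = 25, because the variance term grows sharply as n approaches p = 50, and it falls again once n is well above p. That pattern is the kind of "more data hurt" behavior the abstract refers to.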