Variable Selection and Minimax Prediction in High-dimensional Functional Linear Model

Xingche Guo, Yehua Li, Tailen Hsing
2024-08-11
Abstract: High-dimensional functional data have become increasingly prevalent in modern applications such as high-frequency financial data and neuroimaging data analysis. We investigate a class of high-dimensional linear regression models, where each predictor is a random element in an infinite-dimensional function space, and the number of functional predictors $p$ can potentially be ultra-high. Assuming that each of the unknown coefficient functions belongs to some reproducing kernel Hilbert space (RKHS), we regularize the fitting of the model by imposing a group elastic-net type of penalty on the RKHS norms of the coefficient functions. We show that our loss function is Gâteaux sub-differentiable, and our functional elastic-net estimator exists uniquely in the product RKHS. Under suitable sparsity assumptions and a functional version of the irrepresentable condition, we derive a non-asymptotic tail bound for variable selection consistency of our method. Allowing the number of true functional predictors $q$ to diverge with the sample size, we also show a post-selection refined estimator can achieve the oracle minimax optimal prediction rate. The proposed methods are illustrated through simulation studies and a real-data application from the Human Connectome Project.
Methodology, Statistics Theory
### What problem does this paper attempt to solve?

This paper studies variable selection and minimax optimal prediction in high-dimensional functional linear models (FLMs). Specifically:

1. **Research Background**:
   - Modern technology generates large amounts of high-frequency repeated-measurement data, which can be modeled as functional data.
   - High-dimensional functional data are increasingly common in applications such as high-frequency financial data and neuroimaging data analysis.
2. **Research Objectives**:
   - Propose a variable selection method for high-dimensional functional linear regression, where the number of functional predictors $p$ can be ultra-high.
   - Regularize the fit with a group elastic-net type of penalty on the RKHS norms of the coefficient functions, which encourages both sparsity and smoothness.
   - Derive a non-asymptotic tail bound for variable selection consistency, under suitable sparsity assumptions and a functional version of the irrepresentable condition.
   - Show that a post-selection refined estimator achieves the oracle minimax optimal prediction rate, even when the number of true functional predictors $q$ diverges with the sample size.
3. **Main Contributions**:
   - Proposed a double penalty method based on the Reproducing Kernel Hilbert Space (RKHS) framework, and showed that the resulting functional elastic-net estimator exists uniquely in the product RKHS.
   - Established variable selection consistency results for high-dimensional functional linear models.
   - Derived the minimax optimal prediction rate for high-dimensional functional linear models and proved that a post-selection refined estimator attains it.
   - Illustrated the proposed method through simulation studies and a real-data application from the Human Connectome Project.

In summary, the paper addresses variable selection in high-dimensional functional linear models and provides a theoretical framework that guarantees both variable selection consistency and minimax optimal prediction.
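To make the setup concrete, the model and the penalized criterion described above can be sketched as follows. This is an illustrative form only: the abstract states that a group elastic-net type of penalty is imposed on the RKHS norms of the coefficient functions, but the domain $\mathcal{T}$, the spaces $\mathcal{H}_j$, the tuning parameters $\lambda_1, \lambda_2$, the $1/(2n)$ scaling, and the exact norms used in each penalty term are assumptions made here for illustration and may differ from the paper's definitions.

```latex
% Functional linear model with p functional predictors (assumed notation)
Y_i \;=\; \sum_{j=1}^{p} \int_{\mathcal{T}} X_{ij}(t)\,\beta_j(t)\,dt \;+\; \varepsilon_i,
\qquad i = 1,\dots,n, \quad \beta_j \in \mathcal{H}_j .

% Hypothetical group elastic-net type criterion on the RKHS norms
(\hat\beta_1,\dots,\hat\beta_p)
\;=\; \operatorname*{arg\,min}_{\beta_j \in \mathcal{H}_j}
\; \frac{1}{2n} \sum_{i=1}^{n}
\Bigl( Y_i - \sum_{j=1}^{p} \int_{\mathcal{T}} X_{ij}(t)\,\beta_j(t)\,dt \Bigr)^{2}
\;+\; \lambda_1 \sum_{j=1}^{p} \lVert \beta_j \rVert_{\mathcal{H}_j}
\;+\; \frac{\lambda_2}{2} \sum_{j=1}^{p} \lVert \beta_j \rVert_{\mathcal{H}_j}^{2} .
```

In a criterion of this shape, the unsquared norm term acts at the group level and can set entire coefficient functions to zero, which is what performs variable selection. The squared-norm term adds strict convexity, which is consistent with the uniqueness of the estimator claimed in the abstract.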