Ensemble Subset Regression (ENSURE): Efficient High-dimensional Prediction

Rong Zhu, Hua Liang, David Ruppert
DOI: https://doi.org/10.5705/ss.202021.0187
IF: 1.4
2024-01-01
Statistica Sinica
Abstract: In high-dimensional prediction problems, we propose subsampling the predictors prior to the analysis. Specifically, we draw features using random sampling, and then fit a model and make predictions based on the sampled feature subset. This greatly reduces the dimension, storage, and computational bottlenecks. We explore this "subset regression" strategy under a linear regression framework. We propose an ensemble method, called ensemble subset regression (ENSURE), that combines multiple subset regressions to reduce the uncertainty due to feature sampling. We provide a theoretical upper bound on the excess risk of the predictions computed in the subset regression, and provide theoretical support that the ensemble can improve the performance of the subset regression. Detailed empirical studies demonstrate that ENSURE performs well, better than methods that use all features.
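The strategy described in the abstract (sample a feature subset, fit a linear model on it, predict, then combine several such fits) can be illustrated with a minimal sketch. The abstract does not specify the sampling distribution, the fitting procedure, or how the subset predictions are combined, so the choices here are assumptions: uniform sampling without replacement, ordinary least squares without an intercept, and a simple average of predictions. The function names `subset_regression_fit` and `ensure_predict` and the parameters `k` and `n_subsets` are hypothetical and do not come from the paper.

```python
import numpy as np

def subset_regression_fit(X, y, k, rng):
    """Fit least squares on k randomly chosen columns of X.

    Assumption: features are drawn uniformly without replacement;
    the paper's actual sampling scheme may differ.
    """
    idx = rng.choice(X.shape[1], size=k, replace=False)
    coef, *_ = np.linalg.lstsq(X[:, idx], y, rcond=None)
    return idx, coef

def ensure_predict(X_train, y_train, X_test, k, n_subsets, seed=0):
    """Average the predictions of n_subsets independent subset regressions.

    Assumption: equal-weight averaging; the paper's combination rule
    is not stated in the abstract.
    """
    rng = np.random.default_rng(seed)
    preds = np.zeros(X_test.shape[0])
    for _ in range(n_subsets):
        idx, coef = subset_regression_fit(X_train, y_train, k, rng)
        preds += X_test[:, idx] @ coef
    return preds / n_subsets
```

Each subset fit only touches an n-by-k slice of the design matrix, which is where the storage and computation savings mentioned in the abstract would come from when k is much smaller than the number of features.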