A general framework of online updating variable selection for generalized linear models with streaming datasets

Xiaoyu Ma, Lu Lin, Yujie Gai
DOI: https://doi.org/10.1080/00949655.2022.2107207
IF: 1.225
2022-08-10
Journal of Statistical Computation and Simulation
Abstract: In the era of big data, one important issue is how to recover the set of true features when the data sets arrive sequentially. The paper presents a general framework for online updating variable selection and parameter estimation in generalized linear models with streaming datasets. The framework is a type of online updating penalized likelihood with differentiable or non-differentiable penalty functions. An online updating coordinate descent algorithm is proposed for solving the online updating optimization problem. Moreover, a method for selecting the tuning parameter in an online updating manner is suggested. The selection and estimation consistencies and the oracle property are established theoretically. Our methods are further examined and illustrated by various numerical examples from both simulation experiments and a real data analysis.
statistics & probability; computer science, interdisciplinary applications
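
The abstract describes online updating penalized estimation solved by coordinate descent. As a rough illustration of that general idea only, and not the authors' algorithm, the Python sketch below assumes the simplest special case: a Gaussian linear model with a lasso penalty and a fixed tuning parameter. The class name OnlineLasso, the value lam = 0.1, and the simulated data are all hypothetical. Each arriving batch is folded into running summaries, the raw batch is then discarded, and coordinate descent is rerun on the summaries with a warm start.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used by lasso coordinate descent."""
    return np.sign(z) * max(abs(z) - t, 0.0)


class OnlineLasso:
    """Hypothetical sketch: lasso for a Gaussian linear model fitted
    from cumulative summary statistics rather than stored raw data."""

    def __init__(self, p, lam):
        self.lam = lam                 # fixed tuning parameter (assumption)
        self.n = 0                     # observations seen so far
        self.gram = np.zeros((p, p))   # running sum of X_b^T X_b
        self.xty = np.zeros(p)         # running sum of X_b^T y_b
        self.beta = np.zeros(p)        # current estimate, reused as warm start

    def update(self, X, y, n_sweeps=200, tol=1e-10):
        # Fold the new batch into the summaries; the batch is not kept.
        self.n += X.shape[0]
        self.gram += X.T @ X
        self.xty += X.T @ y

        # Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 over all data
        # seen so far, using only the summaries.
        for _ in range(n_sweeps):
            max_change = 0.0
            for j in range(len(self.beta)):
                if self.gram[j, j] == 0.0:
                    continue  # column is identically zero so far
                # x_j^T (y - X_{-j} b_{-j}) expressed through the summaries
                rho = (self.xty[j] - self.gram[j] @ self.beta
                       + self.gram[j, j] * self.beta[j])
                new = soft_threshold(rho / self.n, self.lam) / (self.gram[j, j] / self.n)
                max_change = max(max_change, abs(new - self.beta[j]))
                self.beta[j] = new
            if max_change < tol:
                break
        return self.beta.copy()


# Usage on simulated streaming batches (illustrative data only).
rng = np.random.default_rng(0)
beta_true = np.array([2.0, 0.0, -1.5, 0.0, 0.0])
model = OnlineLasso(p=5, lam=0.1)
for _ in range(3):
    X = rng.standard_normal((200, 5))
    y = X @ beta_true + 0.5 * rng.standard_normal(200)
    beta_hat = model.update(X, y)
print(np.round(beta_hat, 3))
```

In this Gaussian special case the running summaries carry all the information the objective needs, so the streamed fit agrees with a single fit on the pooled data. Other generalized linear models do not reduce to such summaries in general and call for an approximation, and the paper's online tuning parameter selection is not reproduced here.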