Change-Point Estimation in High Dimensional Regression Models

Bingwen Zhang, Jun Geng, L. Lai
2016-01-01
Abstract: We consider high dimensional nonhomogeneous linear regression models with p/n ↛ 0 or p >> n, where p is the number of features and n is the number of observations. In the model considered, the underlying true regression coefficients undergo multiple changes. Our goal is to estimate the number and locations of these change-points and to estimate the sparse coefficients in each of the intervals between change-points. This paper develops an approach, based on the sparse group Lasso (SGL), to solve the multiple change-point estimation problem in high dimensional linear regression models. We analyze the performance of our approach and prove several consistency results. In particular, under certain assumptions and using a properly chosen regularization parameter, we show that the estimation errors of the linear coefficients and change-point locations can be expressed as functions of n, p and s, where s is the sparsity level of each coefficient. From these functions, we can understand how the estimation errors scale with the system parameters and identify conditions on the system parameters under which the estimation errors diminish. Furthermore, we show that the estimation of change-points always overfits, which eliminates the risk of missing true change-points, and that isolated estimated change-points between true change-points do not occur, which implies that the estimated change-points are clustered around the true change-points. We further extend our study to general linear models (GLM) and prove similar results. Numerical simulations are provided to illustrate the effectiveness of our approach.
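To make the SGL idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes one common reparametrization: the coefficient vector at observation t is the running sum of increments theta_1, ..., theta_t, so a change-point is an index where the whole increment vector is nonzero (encouraged by a group penalty), while an l1 penalty keeps each increment sparse. The function names, the penalty weights lam_group and lam_l1, and the choice to penalize every row, including the first, are all assumptions made for illustration. The proximal step uses the standard fact that the prox of an l1 term plus an l2 group term is soft-thresholding followed by group shrinkage.

```python
import numpy as np

def cumulative_coefficients(theta):
    """Row t holds beta_t = theta_0 + ... + theta_t (0-based rows)."""
    return np.cumsum(theta, axis=0)

def sgl_objective(X, y, theta, lam_group, lam_l1):
    """Mean squared error plus group (l2) and l1 penalties on the increments.

    Assumed form, for illustration only; the paper's exact weighting may differ.
    """
    beta = cumulative_coefficients(theta)
    # Per-observation prediction x_t^T beta_t.
    residual = y - np.einsum("tp,tp->t", X, beta)
    loss = np.mean(residual ** 2)
    group_penalty = lam_group * np.linalg.norm(theta, axis=1).sum()
    l1_penalty = lam_l1 * np.abs(theta).sum()
    return loss + group_penalty + l1_penalty

def sgl_prox(v, t_group, t_l1):
    """Prox of t_l1*||v||_1 + t_group*||v||_2 for a single group v."""
    # Step 1: elementwise soft-thresholding (l1 part).
    u = np.sign(v) * np.maximum(np.abs(v) - t_l1, 0.0)
    # Step 2: shrink the whole group toward zero (l2 part).
    norm = np.linalg.norm(u)
    if norm <= t_group:
        return np.zeros_like(u)
    return (1.0 - t_group / norm) * u

def estimated_change_points(theta, tol=1e-8):
    """Indices t >= 1 whose increment is nonzero; row 0 is the initial coefficient."""
    norms = np.linalg.norm(theta, axis=1)
    return [t for t in range(1, len(norms)) if norms[t] > tol]

# Toy example: n = 6 observations, p = 4 features, one change at index 3.
theta = np.zeros((6, 4))
theta[0] = [1.0, 0.0, 0.0, 0.0]
theta[3] = [-1.0, 2.0, 0.0, 0.0]
beta = cumulative_coefficients(theta)
change_points = estimated_change_points(theta)

X = np.ones((6, 4))
y = beta.sum(axis=1)  # noiseless responses, so the squared-error term is zero
objective = sgl_objective(X, y, theta, lam_group=0.1, lam_l1=0.01)
shrunk = sgl_prox(np.array([3.0, 0.5, -4.0]), t_group=1.0, t_l1=1.0)
```

In a full solver, sgl_prox would be applied row by row inside a proximal gradient loop. The group penalty is what drives most increments exactly to zero, so that only a few candidate change-points remain.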