On LASSO for predictive regression

Ji Hyung Lee, Zhentao Shi, Zhan Gao
DOI: https://doi.org/10.1016/j.jeconom.2021.02.002
IF: 3.363
2021-03-01
Journal of Econometrics
Abstract: Explanatory variables in a predictive regression typically exhibit low signal strength and various degrees of persistence. Variable selection in such a context is of great importance. In this paper, we explore the pitfalls and possibilities of the LASSO methods in this predictive regression framework. In the presence of stationary, local unit root, and cointegrated predictors, we show that the adaptive LASSO cannot asymptotically eliminate all cointegrating variables with zero regression coefficients. This new finding motivates a novel post-selection adaptive LASSO, which we call the twin adaptive LASSO (TAlasso), to restore variable selection consistency. Accommodating the system of heterogeneous regressors, TAlasso achieves the well-known oracle property. In contrast, conventional LASSO fails to attain coefficient estimation consistency and variable screening in all components simultaneously. We apply these LASSO methods to evaluate the short- and long-horizon predictability of S&P 500 excess returns.
economics; social sciences, mathematical methods; mathematics, interdisciplinary applications
What problem does this paper attempt to address?
The main problem this paper addresses is the effectiveness and limitations of LASSO methods for variable selection in predictive regression. Specifically, the authors examine whether the adaptive LASSO (Alasso) can exclude all cointegrated variables with zero regression coefficients when the predictors are a mix of stationary, local unit root, and cointegrated series. They find that the conventional adaptive LASSO cannot asymptotically eliminate all of these inactive cointegrated variables, which motivates a new post-selection adaptive LASSO, the "twin adaptive LASSO" (TAlasso), that restores variable selection consistency.

### Background and Motivation of the Paper

1. **Challenges in Predictive Regression**:
   - Predictive regression is widely used in empirical finance, especially in regressions of stock returns. Its main challenges are test size distortion caused by highly persistent predictors, a low signal-to-noise ratio (SNR), and the fact that many candidate variables have no predictive ability because markets are competitive.
   - Machine learning techniques have opened new opportunities for economic data analysis. In the big data era, shrinkage methods such as LASSO have become increasingly popular because they combine variable selection with regularization.
2. **Heterogeneity of Time-Series Predictors**:
   - Time-series predictors differ in their degree of persistence, ranging from short memory (such as Treasury bills) to high persistence (such as most financial and macroeconomic predictors).
   - Several persistent predictors may be cointegrated, such as the dividend-price ratio (DP ratio) and the so-called cay data (Lettau and Ludvigson, 2001).

### Main Contributions

1. **Research and Improvement of Adaptive LASSO in Time Series**:
   - The authors study the performance of the adaptive LASSO in a time-series context and identify its limitations in mixed-root models, i.e., models whose predictors have different degrees of persistence.
   - A new method, the twin adaptive LASSO (TAlasso), is proposed. It restores variable selection consistency by running the adaptive LASSO in two rounds.
2. **Theoretical Results**:
   - TAlasso is proved to achieve the oracle property in the mixed-root model: the estimator converges at the same rate as OLS would if the relevant variables were known in advance, and variable selection is consistent.
   - This result is established for the first time in the non-stationary time-series context, filling a gap in the existing literature.

### Empirical Applications

- The authors examine the finite-sample performance of TAlasso through Monte Carlo simulations and real-data applications. The results show that TAlasso performs well in selecting the correct model, which improves prediction accuracy.
- Using the 12 predictors provided by Welch and Goyal (2008) to predict S&P 500 returns, TAlasso outperforms the other methods and remains robust across estimation windows and prediction horizons.

### Conclusions

Through theoretical analysis and empirical study, the paper identifies the pitfalls of using LASSO methods for variable selection in predictive regression and offers a remedy. The proposed TAlasso performs well in mixed-root models and provides an effective tool for financial and macroeconomic forecasting.
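
The two-round idea behind TAlasso described under Main Contributions can be illustrated with a minimal sketch. It assumes that TAlasso amounts to running the adaptive LASSO once on all predictors and then again on the surviving ones, with the initial estimate recomputed on that subset. The coordinate-descent solver, the function names, the tuning values `lam` and `gamma`, the omission of an intercept, and the simulated data are all illustrative assumptions, not the authors' implementation.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used by LASSO coordinate updates."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def weighted_lasso(cols, y, lam, weights, n_iter=200):
    """Coordinate descent for 0.5*||y - X b||^2 + lam * sum_j w_j*|b_j|.

    cols is a list of predictor columns (each a list of n floats).
    """
    p = len(cols)
    beta = [0.0] * p
    resid = list(y)
    norms = [sum(v * v for v in c) for c in cols]
    for _ in range(n_iter):
        for j in range(p):
            c = cols[j]
            # Inner product of column j with the residual excluding its own fit.
            rho = sum(ci * ri for ci, ri in zip(c, resid)) + norms[j] * beta[j]
            new = soft_threshold(rho, lam * weights[j]) / norms[j]
            if new != beta[j]:
                step = new - beta[j]
                resid = [ri - step * ci for ri, ci in zip(resid, c)]
                beta[j] = new
    return beta

def adaptive_lasso(cols, y, lam, gamma=1.0):
    """Adaptive LASSO with penalty weights 1/|initial estimate|^gamma."""
    p = len(cols)
    # Assumption: approximate the OLS initial estimate by running the same
    # solver with a zero penalty.
    init = weighted_lasso(cols, y, 0.0, [0.0] * p)
    weights = [1.0 / max(abs(b), 1e-12) ** gamma for b in init]
    return weighted_lasso(cols, y, lam, weights)

def twin_adaptive_lasso(cols, y, lam, gamma=1.0):
    """Second adaptive LASSO round restricted to first-round survivors."""
    first = adaptive_lasso(cols, y, lam, gamma)
    keep = [j for j, b in enumerate(first) if b != 0.0]
    beta = [0.0] * len(cols)
    if keep:
        second = adaptive_lasso([cols[j] for j in keep], y, lam, gamma)
        for j, b in zip(keep, second):
            beta[j] = b
    return beta

# Simulated illustration (hypothetical data): one active stationary predictor,
# one inactive random walk, and three inactive stationary predictors.
random.seed(0)
n = 100
x_active = [random.gauss(0, 1) for _ in range(n)]
x_noise = [[random.gauss(0, 1) for _ in range(n)] for _ in range(3)]
walk, level = [], 0.0
for _ in range(n):
    level += random.gauss(0, 1)
    walk.append(level)
cols = [x_active, walk] + x_noise
y = [xi + random.gauss(0, 1) for xi in x_active]

beta_first = adaptive_lasso(cols, y, lam=2.0)
beta_twin = twin_adaptive_lasso(cols, y, lam=2.0)
print("first round :", [round(b, 3) for b in beta_first])
print("second round:", [round(b, 3) for b in beta_twin])
```

By construction, the second round can only drop variables that survived the first, so its selected set is never larger. In practice the penalty level would be chosen by a data-driven rule such as cross-validation or an information criterion rather than fixed by hand as it is here.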