An efficient optimization approach for best subset selection in linear regression, with application to model selection and fitting in autoregressive time-series

Leonardo Di Gangi, M. Lapucci, F. Schoen, A. Sortino
DOI: https://doi.org/10.1007/s10589-019-00134-5
2019-09-21
Computational Optimization and Applications
Abstract: In this paper we consider two relevant optimization problems: selecting the best sparse linear regression model, and optimally identifying the parameters of autoregressive models from time-series data. These problems, which are different but related, are usually solved through a sequence of separate steps that alternate between choosing a subset of features and finding a best-fit regression. We propose to model both problems as mixed-integer nonlinear optimization problems and to solve them with numerical procedures based on state-of-the-art optimization tools. The proposed approach has the advantage of treating model selection and parameter estimation as a single optimization problem. Numerical experiments on widely available datasets and on synthetic ones confirm the effectiveness of our approach, both in the quality of the resulting models and in CPU time.
Keywords: mathematics, applied; operations research & management science
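
To make the idea of treating subset choice and coefficient fitting as one problem concrete, here is a minimal sketch of cardinality-constrained least squares solved by brute-force enumeration. This is not the paper's mixed-integer method: it is a textbook baseline that is only practical for a handful of features, and the paper's own formulation may differ. The function name `best_subset`, the size limit `k`, and the synthetic data are assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def best_subset(X, y, k):
    """Try every feature subset of size <= k and return the one with the
    lowest residual sum of squares, as (support, coefficients, rss)."""
    n_features = X.shape[1]
    # Start from the empty model, whose residual is y itself.
    best_rss, best_support, best_coef = float(y @ y), (), np.zeros(0)
    for size in range(1, k + 1):
        for support in combinations(range(n_features), size):
            cols = list(support)
            # Ordinary least squares restricted to the chosen columns.
            coef, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            resid = y - X[:, cols] @ coef
            rss = float(resid @ resid)
            if rss < best_rss:
                best_rss, best_support, best_coef = rss, support, coef
    return best_support, best_coef, best_rss


# Illustrative synthetic data: only features 1 and 4 drive the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
y = 2.0 * X[:, 1] - 3.0 * X[:, 4] + 0.01 * rng.normal(size=50)

support, coef, rss = best_subset(X, y, k=2)
print(support, coef, rss)
```

The enumeration visits every subset of at most `k` features, so its cost grows combinatorially with the number of features. That growth is the reason to look for optimization-based formulations such as the one the abstract describes.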