PanIC: Consistent information criteria for general model selection problems

Hien Duy Nguyen
DOI: https://doi.org/10.1111/anzs.12426
2024-11-01
Australian & New Zealand Journal of Statistics
Abstract: Model selection is a ubiquitous problem that arises in the application of many statistical and machine learning methods. In the likelihood and related settings, it is typical to use the method of information criteria (ICs) to choose the most parsimonious among competing models by penalizing the likelihood‐based objective function. Theorems guaranteeing the consistency of ICs can often be difficult to verify and are often specific and bespoke. We present a set of results that guarantee consistency for a class of ICs, which we call PanIC (from the Greek root 'pan', meaning 'of everything'), with easily verifiable regularity conditions. PanICs are applicable in any loss‐based learning problem and are not exclusive to likelihood problems. We illustrate the verification of regularity conditions for model selection problems regarding finite mixture models, least absolute deviation and support vector regression and principal component analysis, and demonstrate the effectiveness of PanICs for such problems via numerical simulations. Furthermore, we present new sufficient conditions for the consistency of BIC‐like estimators and provide comparisons of the BIC with PanIC.
statistics & probability