Abstract: Information criteria provide a cogent approach for identifying models that provide an optimal balance between the competing objectives of goodness-of-fit and parsimony. Models that better conform to a dataset are often more complex, yet such models are plagued by greater variability in estimation and prediction. Conversely, overly simplistic models reduce variability at the cost of increases in bias. Asymptotically efficient criteria are those that, for large samples, select the fitted candidate model whose predictors minimize the mean squared prediction error, optimizing between prediction bias and variability. In the context of prediction, asymptotically efficient criteria are thus a preferred tool for model selection, with the Akaike information criterion (AIC) being among the most widely used. However, asymptotic efficiency relies upon the assumption of a panel of validation data generated independently from, but identically to, the set of training data. We argue that assuming identically distributed training and validation data is misaligned with the premise of prediction and often violated in practice. This is most apparent in a regression context, where assuming training/validation data homogeneity requires identical panels of regressors. We therefore develop a new class of predictive information criteria (PIC) that do not assume training/validation data homogeneity and are shown to generalize AIC to the more practically relevant setting of training/validation data heterogeneity. The analytic properties and predictive performance of these new criteria are explored within the traditional regression framework. We consider both simulated and real-data settings. Software for implementing these methods is provided in the R package picR, available through CRAN.
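
For readers unfamiliar with the baseline that the abstract generalizes, the following is a minimal sketch of the classical AIC for a Gaussian linear model, used to compare an intercept-only fit against a simple linear fit. It is not the PIC of the paper and does not use the picR package; the toy data and the helper names (`gaussian_aic`, `rss_intercept_only`, `rss_simple_line`) are assumptions introduced purely for illustration.

```python
import math


def gaussian_aic(rss, n, n_coef):
    """Classical AIC for a Gaussian linear model fitted by maximum likelihood.

    rss    : residual sum of squares of the fitted model
    n      : number of training observations
    n_coef : number of regression coefficients (including the intercept)
    """
    sigma2_hat = rss / n  # ML estimate of the error variance
    log_lik = -0.5 * n * (math.log(2 * math.pi * sigma2_hat) + 1)
    n_params = n_coef + 1  # regression coefficients plus the error variance
    return -2 * log_lik + 2 * n_params


def rss_intercept_only(y):
    """Residual sum of squares when every response is predicted by the mean."""
    y_bar = sum(y) / len(y)
    return sum((v - y_bar) ** 2 for v in y)


def rss_simple_line(x, y):
    """Residual sum of squares of the least-squares line y = a + b * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))


# Hypothetical toy data with a nearly linear trend (illustration only).
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0, 5.9, 7.2]

aic_mean = gaussian_aic(rss_intercept_only(y), len(y), n_coef=1)
aic_line = gaussian_aic(rss_simple_line(x, y), len(y), n_coef=2)

# The candidate with the smaller AIC is the one the criterion selects.
selected = "line" if aic_line < aic_mean else "mean"
print(f"AIC (intercept only): {aic_mean:.2f}")
print(f"AIC (simple line):    {aic_line:.2f}")
print(f"selected model:       {selected}")
```

The penalty term `2 * n_params` is what trades goodness-of-fit against parsimony. Note that this classical form is computed from the training data alone, which is where the homogeneity assumption questioned in the abstract enters.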

A new class of information criteria for improved prediction in the presence of training/validation data heterogeneity

Fast leave-one-cluster-out cross-validation using clustered Network Information Criterion (NICc)

On Statistical Efficiency in Learning

Posterior Averaging Information Criterion

A new integrated discrimination improvement index via odds

Optimizer's Information Criterion: Dissecting and Correcting Bias in Data-Driven Optimization

A PAC-Bayesian Perspective on the Interpolating Information Criterion

Gibbs-Based Information Criteria and the Over-Parameterized Regime

Selection of Regression Models under Linear Restrictions for Fixed and Random Designs

An Information Analysis on Modeling Interaction Effects in Logistic Regression

Model selection and psychological theory: A discussion of the differences between the Akaike information criterion (AIC) and the Bayesian information criterion (BIC).

Minimization of Akaike's information criterion in linear regression analysis via mixed integer nonlinear program

PanIC: consistent information criteria for general model selection problems

Rebuilding Factorized Information Criterion: Asymptotically Accurate Marginal Likelihood

Why Is My Classifier Discriminatory?

Extending AIC to best subset regression

A note on numerical evaluation of conditional Akaike information for nonlinear mixed-effects models

Predictive Heterogeneity: Measures and Applications

Prediction-based variable selection for component-wise gradient boosting

Consistent Model Selection Procedure for Random Coefficient INAR Models