Ultra-High Dimensional Model Averaging for Multi-Categorical Response

Guo, Chaohui
DOI: https://doi.org/10.1007/s40304-023-00379-x
2024-06-27
Communications in Mathematics and Statistics
Abstract: Model averaging has been regarded as a powerful tool for model-based prediction over the past decades. However, its application to ultra-high dimensional multi-categorical data faces challenges arising from model uncertainty and heterogeneity. In this article, a novel two-step model averaging method is proposed for multi-categorical response when the number of covariates is ultra-high. First, a class of adaptive multinomial logistic regression candidate models is constructed, where different covariates are allowed for each category to accommodate heterogeneity. Second, the optimal model weights are chosen by minimizing the Kullback–Leibler loss plus a penalty term. We show that the proposed model averaging estimator is asymptotically optimal in the sense that it achieves the minimum Kullback–Leibler loss among all possible averaging estimators. Empirical evidence from simulation studies and a real data example demonstrates that the proposed model averaging method outperforms state-of-the-art approaches.
mathematics