Testing the Statistical Significance of an Ultra-High-Dimensional Naive Bayes Classifier

Baiguo An, Hansheng Wang, Jianhua Guo
DOI: https://doi.org/10.2139/ssrn.2039110
2013-01-01
Statistics and Its Interface
Abstract: The naive Bayes approach is one of the most popular methods used for classification. Nevertheless, how to test its statistical significance under an ultra-high-dimensional (UHD) setup is not well understood. To fill this important theoretical gap, we propose a novel test statistic whose asymptotic null distribution is standard normal, even if the predictor dimension is considerably larger than the sample size. This makes the proposed method useful for UHD data analysis. Simulation studies are presented to demonstrate its finite-sample performance, and a text classification example is described for illustration.