A Naïve Bayes Regularized Logistic Regression Estimator for Low-dimensional Classification

Yi Tan, Ben Sherwood, Prakash P. Shenoy
DOI: https://doi.org/10.1016/j.ijar.2024.109239
IF: 4.452
2024-06-22
International Journal of Approximate Reasoning
Abstract: To reduce the estimator's variance and prevent overfitting, regularization techniques have attracted great interest from the statistics and machine learning communities. Most existing regularized methods rely on the sparsity assumption that a model with fewer parameters predicts better than one with many parameters. This assumption works particularly well in high-dimensional problems. However, the sparsity assumption may not be necessary when the number of predictors is relatively small compared to the number of training instances. This paper argues that shrinking the coefficients towards a low-variance data-driven estimate could be a better regularization strategy for such situations. For low-dimensional classification problems, we propose a naïve Bayes regularized logistic regression (NBRLR) that shrinks the logistic regression coefficients toward the naïve Bayes estimate to provide a reduction in variance. Our approach is primarily motivated by the fact that naïve Bayes is functionally equivalent to logistic regression if naïve Bayes' conditional independence assumption holds. Under standard conditions, we prove the consistency of the NBRLR estimator. Extensive simulation and empirical experimental results show that NBRLR is a competitive alternative to various state-of-the-art classifiers, especially on low-dimensional datasets.
computer science, artificial intelligence
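
The shrinkage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not state the exact penalty, so everything here is an assumption: a Gaussian naïve Bayes model with a per-feature variance shared across classes (which makes its log-odds linear in the predictors), a squared L2 penalty that pulls the logistic regression coefficients toward those naïve Bayes coefficients, and the hypothetical function names `naive_bayes_coefficients` and `fit_shrunk_logistic`.

```python
import numpy as np


def naive_bayes_coefficients(X, y):
    """Linear log-odds implied by Gaussian naive Bayes (assumed variant:
    per-feature variance pooled over both classes). Returns [intercept, slopes]."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class variance for each feature.
    resid = np.vstack([X0 - mu0, X1 - mu1])
    var = (resid ** 2).sum(axis=0) / (len(X) - 2)
    slopes = (mu1 - mu0) / var
    intercept = np.log(len(X1) / len(X0)) - ((mu1 ** 2 - mu0 ** 2) / (2 * var)).sum()
    return np.concatenate([[intercept], slopes])


def fit_shrunk_logistic(X, y, target, lam, n_iter=2000):
    """Gradient descent on: mean logistic loss + lam * ||beta - target||^2.
    Assumption: the intercept is penalized like every other coefficient."""
    Xb = np.column_stack([np.ones(len(X)), X])
    n = len(y)
    # Step size 1/L, where L bounds the curvature of the objective.
    L = 0.25 * np.linalg.eigvalsh(Xb.T @ Xb / n).max() + 2.0 * lam
    beta = target.astype(float).copy()  # start at the naive Bayes estimate
    for _ in range(n_iter):
        z = np.clip(Xb @ beta, -30, 30)  # avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-z))
        grad = Xb.T @ (p - y) / n + 2.0 * lam * (beta - target)
        beta -= grad / L
    return beta
```

With `lam = 0` the fit moves toward ordinary logistic regression. As `lam` grows, the solution stays closer to the naïve Bayes coefficients, trading some bias (when conditional independence fails) for lower variance.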