Random-matrix Regularized Discriminant Analysis of High-Dimensional Dataset

Peng Liu, Bin Ye, Yangquan Guo, Hanyang Wang, Fei Chu
DOI: https://doi.org/10.1109/dcabes.2018.00060
2018-01-01
Abstract: Linear discriminant analysis (LDA) is one of the most popular parametric classification methods in machine learning and data mining. Although it performs well in many applications, LDA is impractical for high-dimensional data sets. A primary reason is that the sample covariance matrix is no longer a good estimator of the true covariance matrix when the feature dimension p is close to or even larger than the sample size n. Here we propose to regularize the LDA classifier by employing a consistent estimator of high-dimensional covariance matrices. Using theoretical tools from random matrix theory, the covariance matrices in high dimensions are estimated by linear or nonlinear shrinkage, depending on the relationship between the dimension p and the sample size n. Numerical simulations demonstrate that regularized discriminant analysis using random matrix theory yields higher accuracy than existing competitors on a wide variety of synthetic and real data sets.
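As a rough illustration of the idea in the abstract, the following minimal sketch regularizes LDA by linearly shrinking the pooled sample covariance toward a scaled identity. It is not the estimator proposed in the paper: the fixed shrinkage weight `alpha`, the identity target, and the function names `shrink_covariance`, `fit_regularized_lda`, and `predict` are all assumptions made for this example, and the paper's random-matrix-based linear and nonlinear shrinkage rules are not reproduced here.

```python
import numpy as np


def shrink_covariance(S, alpha):
    """Linear shrinkage of a covariance matrix S toward a scaled identity.

    alpha in (0, 1] is a hand-picked weight (an assumption of this sketch,
    not a data-driven or random-matrix-derived value).
    """
    p = S.shape[0]
    mu = np.trace(S) / p  # average eigenvalue of S
    return (1.0 - alpha) * S + alpha * mu * np.eye(p)


def fit_regularized_lda(X, y, alpha=0.5):
    """Fit LDA with a shrunk pooled within-class covariance."""
    classes = np.unique(y)
    means = np.stack([X[y == c].mean(axis=0) for c in classes])
    # Center every sample by its own class mean, then pool.
    centered = X - means[np.searchsorted(classes, y)]
    S = centered.T @ centered / (X.shape[0] - len(classes))
    sigma = shrink_covariance(S, alpha)
    priors = np.array([np.mean(y == c) for c in classes])
    return classes, means, sigma, priors


def predict(model, X):
    """Assign each row of X to the class with the largest linear score."""
    classes, means, sigma, priors = model
    W = np.linalg.solve(sigma, means.T)  # columns are Sigma^{-1} mu_k
    scores = X @ W - 0.5 * np.sum(means.T * W, axis=0) + np.log(priors)
    return classes[np.argmax(scores, axis=1)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 40, 50  # more features than samples
    y = np.repeat([0, 1], n // 2)
    X = rng.standard_normal((n, p))
    X[y == 1] += 1.0  # shift the mean of the second class

    model = fit_regularized_lda(X, y, alpha=0.5)
    print("smallest eigenvalue of shrunk covariance:",
          np.linalg.eigvalsh(model[2]).min())
    print("training accuracy:", np.mean(predict(model, X) == y))
```

With p larger than n the pooled sample covariance is singular, so plain LDA cannot invert it. Any `alpha` greater than zero makes the shrunk matrix positive definite, which is why the linear solve in `predict` is well defined.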