Hierarchical clustered multiclass discriminant analysis via cross-validation

Kei Hirose, Kanta Miura, Atori Koie
DOI: https://doi.org/10.1016/j.csda.2022.107613
2023-02-01
Abstract: Linear discriminant analysis (LDA) is a well-known method for multiclass classification and dimensionality reduction. However, ordinary LDA generally does not achieve high prediction accuracy when observations in some classes are difficult to classify. A novel cluster-based LDA method is proposed that significantly improves prediction accuracy. Hierarchical clustering is adopted, and the dissimilarity measure between two clusters is defined by the cross-validation (CV) value, so that clusters are constructed to minimize the misclassification error rate. The proposed approach involves a heavy computational load because the CV value must be computed at each step of the hierarchical clustering algorithm. To address this issue, a regression formulation for LDA is developed and an efficient algorithm that computes an approximate CV value is constructed. The performance of the proposed method is investigated by applying it to both artificial and real datasets. Numerical and theoretical results indicate that the proposed method achieves high prediction accuracy with fast computation.
statistics & probability; computer science, interdisciplinary applications
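
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one plausible reading of the clustering idea: classes are merged greedily, and the pair to merge is the one whose merger yields the lowest K-fold CV misclassification rate for LDA on the merged labels. It uses an exact, brute-force CV and does not implement the paper's regression formulation or its approximate CV computation. The function names (`lda_fit_predict`, `cv_error`, `merge_order`), the fold count, the stopping rule (two clusters), and the toy data are all assumptions made for illustration.

```python
import numpy as np

def lda_fit_predict(X_train, y_train, X_test):
    """Classical LDA: class means, pooled covariance, empirical priors."""
    labels = np.unique(y_train)
    n, p = X_train.shape
    means = np.array([X_train[y_train == g].mean(axis=0) for g in labels])
    priors = np.array([(y_train == g).mean() for g in labels])
    # Pooled within-class covariance; the tiny ridge is an assumption
    # added only for numerical stability.
    centered = X_train - means[np.searchsorted(labels, y_train)]
    cov = centered.T @ centered / (n - len(labels)) + 1e-8 * np.eye(p)
    coef = np.linalg.solve(cov, means.T)  # shape (p, number of labels)
    intercept = -0.5 * np.sum(means.T * coef, axis=0) + np.log(priors)
    return labels[np.argmax(X_test @ coef + intercept, axis=1)]

def cv_error(X, y, n_folds=5, seed=0):
    """K-fold CV misclassification rate of LDA."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    wrong = 0
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        pred = lda_fit_predict(X[train], y[train], X[test_idx])
        wrong += int(np.sum(pred != y[test_idx]))
    return wrong / len(y)

def merge_order(X, y):
    """Greedily merge the pair of clusters whose merger gives the lowest
    CV error, until two clusters remain (hypothetical stopping rule)."""
    clusters = [frozenset([int(g)]) for g in np.unique(y)]
    history = []
    while len(clusters) > 2:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                merged = clusters[i] | clusters[j]
                trial = [c for k, c in enumerate(clusters) if k not in (i, j)]
                trial.append(merged)
                # Relabel each observation by the cluster holding its class.
                lookup = {g: k for k, c in enumerate(trial) for g in c}
                z = np.array([lookup[int(g)] for g in y])
                err = cv_error(X, z)
                if best is None or err < best[0]:
                    best = (err, trial, merged)
        clusters = best[1]
        history.append((sorted(best[2]), float(best[0])))
    return history

# Toy data: classes 0 and 1 overlap heavily, class 2 is well separated.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [0.3, 0.0], [6.0, 6.0]])
y = np.repeat(np.arange(3), 60)
X = centers[y] + rng.normal(size=(180, 2))
history = merge_order(X, y)
print(history)  # list of (merged classes, CV error after the merge)
```

Every candidate merge triggers a full CV run, so the cost grows quickly with the number of classes. This is the computational burden the abstract refers to, which the paper addresses with an approximate CV value rather than the brute-force loop used here.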