GD-RDA: A New Regularized Discriminant Analysis for High-Dimensional Data
Yan Zhou,Baoxue Zhang,Gaorong Li,Tiejun Tong,Xiang Wan
DOI: https://doi.org/10.1089/cmb.2017.0029
IF: 1.549
2017-11-01
Journal of Computational Biology
Abstract:High-throughput techniques bring novel tools and also statistical challenges to genomic research. Identification of which type of diseases a new patient belongs to has been recognized as an important problem. For high-dimensional small sample size data, the classical discriminant methods suffer from the singularity problem and are, therefore, no longer applicable in practice. In this article, we propose a geometric diagonalization method for the regularized discriminant analysis. We then consider a bias correction to further improve the proposed method. Simulation studies show that the proposed method performs better than, or at least as well as, the existing methods in a wide range of settings. A microarray dataset and an RNA-seq dataset are also analyzed and they demonstrate the superiority of the proposed method over the existing competitors, especially when the number of samples is small or the number of genes is large. Finally, we have developed an R package called "GDRDA" which is available upon request.
biochemical research methods,biotechnology & applied microbiology,mathematical & computational biology,computer science, interdisciplinary applications,statistics & probability