A Common Factor-Analytic Model for Classification.

Mingzhu Sun, Geoffrey J. McLachlan
DOI: https://doi.org/10.1109/bibm.2013.6732722
2013-01-01
Abstract: In this era of data explosion, much research has been directed to the problem of filtering and extracting useful information from extremely large datasets. The focus here is on discriminant analysis of high-dimensional data, where the number of dimensions p is very large relative to the number of observations n. Mixture discriminant analysis provides an effective parametric approach, in which each class density is modeled using a mixture of common factor analyzers. Although adopting mixture models with common factor loadings in the components significantly reduces the number of parameters to be estimated, the number of variables must first be reduced to a more manageable level. Thus we consider the problem of dimension reduction for high-dimensional data. In this paper, we propose a factor-analytic model with common factor loadings for classification. We apply our model to a breast cancer study involving microarray gene expression data; the results show that this parametric approach can select informative genes that improve the prediction of disease outcome.
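To illustrate why common factor loadings reduce the number of parameters, the following is a minimal sketch and not the authors' implementation. It assumes the usual common-factor-analyzer parameterization, which the abstract does not spell out: component i has mean A ξ_i and covariance A Ω_i Aᵀ + D, with a p × q loading matrix A shared by all g components and a diagonal D. The function names, the symbols q and g, and the q² identifiability correction in the parameter count are assumptions introduced here for illustration.

```python
import numpy as np

def mcfa_component_moments(A, xi, Omega, d):
    """Mean and covariance of one component under the assumed model:
    mu = A xi,  Sigma = A Omega A^T + diag(d)."""
    mu = A @ xi
    Sigma = A @ Omega @ A.T + np.diag(d)
    return mu, Sigma

def n_params_full(p, g):
    """Free parameters of a g-component normal mixture with unrestricted
    covariances: mixing proportions + means + covariance matrices."""
    return (g - 1) + g * p + g * p * (p + 1) // 2

def n_params_mcfa(p, q, g):
    """Free parameters under the assumed common-loadings model:
    mixing proportions + diag(D) + A + factor means + factor covariances,
    minus q*q (assumed correction for the non-uniqueness of A)."""
    return (g - 1) + p + p * q + g * q + g * q * (q + 1) // 2 - q * q

# Small synthetic example (all values are made up for illustration).
rng = np.random.default_rng(0)
p, q, g = 50, 3, 2
A = rng.standard_normal((p, q))        # shared factor loadings
xi = rng.standard_normal(q)            # factor mean for one component
L = rng.standard_normal((q, q))
Omega = L @ L.T + np.eye(q)            # positive definite factor covariance
d = rng.uniform(0.5, 1.5, size=p)      # diagonal of D (error variances)

mu, Sigma = mcfa_component_moments(A, xi, Omega, d)
print(n_params_full(p, g), n_params_mcfa(p, q, g))
```

Even in this reduced form the count still grows linearly in p, which is consistent with the abstract's point that the variables must be screened first when p is very large.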