Consensus Analysis of Multiple Classifiers Using Non-Repetitive Variables: Diagnostic Application to Microarray Gene Expression Data

Zhenqiang Su, Huixiao Hong, Roger Perkins, Xueguang Shao, Wensheng Cai, Weida Tong
DOI: https://doi.org/10.1016/j.compbiolchem.2007.01.001
IF: 3.737
2007-01-01
Computational Biology and Chemistry
Abstract: Class prediction based on DNA microarray data has emerged as one of the most important applications of bioinformatics for diagnostics and prognostics. Robust classifiers are needed that use the most biologically relevant genes embedded in the data. A consensus approach that combines multiple classifiers has attributes that mitigate this difficulty compared with a single classifier. A new classification method, consensus analysis of multiple classifiers using non-repetitive variables (CAMCUN), was proposed for the analysis of hyper-dimensional gene expression data. The CAMCUN method combined multiple classifiers, each built from distinct, non-repeated genes selected for their effectiveness in class differentiation. Thus, CAMCUN utilized the most biologically relevant genes in the final classifier. The CAMCUN algorithm was demonstrated to give consistently more accurate predictions for two well-known datasets for prostate cancer and leukemia. Importantly, the CAMCUN algorithm employed an integrated 10-fold cross-validation and randomization test to assess the degree of confidence of the predictions for unknown samples.
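The core idea in the abstract, several classifiers that each use a disjoint set of discriminating genes and are then combined by consensus, can be illustrated with a minimal sketch. The abstract does not state the gene-selection criterion, the base classifier, or the voting rule, so everything below is an assumption made for illustration: a simple mean-difference score for ranking genes, nearest-centroid base classifiers, majority voting, and synthetic data. All function names are hypothetical, and the cross-validation and randomization test are not shown.

```python
import random
import statistics
from collections import Counter

def rank_genes(X, y):
    """Rank gene indices by a simple two-class separation score (an assumption)."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        spread = statistics.pstdev(a) + statistics.pstdev(b) + 1e-9
        scores.append((abs(statistics.mean(a) - statistics.mean(b)) / spread, g))
    return [g for _, g in sorted(scores, reverse=True)]

def disjoint_gene_sets(ranked, n_classifiers, genes_per_classifier):
    """Deal the top-ranked genes into non-overlapping sets, one per classifier."""
    top = ranked[: n_classifiers * genes_per_classifier]
    return [top[i::n_classifiers] for i in range(n_classifiers)]

def fit_centroids(X, y, genes):
    """Nearest-centroid model restricted to the given genes."""
    centroids = {}
    for label in set(y):
        rows = [row for row, lab in zip(X, y) if lab == label]
        centroids[label] = [statistics.mean(r[g] for r in rows) for g in genes]
    return centroids

def predict_one(centroids, genes, sample):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def dist(center):
        return sum((sample[g] - v) ** 2 for g, v in zip(genes, center))
    return min(centroids, key=lambda label: dist(centroids[label]))

def consensus_predict(X, y, sample, n_classifiers=3, genes_per_classifier=2):
    """Majority vote over classifiers built on disjoint gene sets."""
    gene_sets = disjoint_gene_sets(rank_genes(X, y), n_classifiers, genes_per_classifier)
    votes = [predict_one(fit_centroids(X, y, gs), gs, sample) for gs in gene_sets]
    return Counter(votes).most_common(1)[0][0], gene_sets

# Synthetic data (not from the paper): 20 genes, the first 6 shifted in class 1.
rng = random.Random(0)

def make_sample(label, n_genes=20, n_informative=6):
    return [rng.gauss(4.0 * label if g < n_informative else 0.0, 1.0)
            for g in range(n_genes)]

y = [0] * 10 + [1] * 10
X = [make_sample(label) for label in y]

# Noise-free prototype samples for each class.
pred_pos, gene_sets = consensus_predict(X, y, [4.0] * 6 + [0.0] * 14)
pred_neg, _ = consensus_predict(X, y, [0.0] * 20)
print(pred_pos, pred_neg, gene_sets)
```

An odd number of classifiers is used here so that a two-class majority vote cannot tie. Because no gene appears in more than one set, each classifier contributes evidence from different genes.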