TotalPLS: Local Dimension Reduction for Multicategory Microarray Data

Wenjie You, Zijiang Yang, Mingshun Yuan, Guoli Ji
DOI: https://doi.org/10.1109/thms.2013.2288777
2014-01-01
IEEE Transactions on Human-Machine Systems
Abstract: Dimension reduction is an important topic in data mining and is widely used in genetics, medicine, and bioinformatics. We propose a new local dimension reduction algorithm, TotalPLS, that operates in a unified partial least squares (PLS) framework and fuses PLS-based feature selection with PLS-based feature extraction. This paper focuses on extracting the potential structure hidden in high-dimensional multicategory microarray data, and on interpreting and understanding the results that this structure information provides. First, we propose using PLS-based recursive feature elimination (PLSRFE) in multicategory problems. Then, we perform feature importance analysis based on PLSRFE on high-dimensional microarray data to determine the informative feature (biomarker) subset relevant to the tumor subtype problem under study. Finally, PLS-based supervised feature extraction is conducted on the selected gene subset to extract comprehensive features that best reflect the class structure and have discriminating ability. The proposed algorithm is compared with several state-of-the-art methods on multiple high-dimensional multicategory microarray datasets. The comparison is performed in terms of recognition accuracy, relevance, and redundancy. Experimental results show that the proposed algorithm can improve the recognition rate and computational efficiency. Furthermore, mining potential structure information improves the interpretability and understandability of the recognition results. The proposed algorithm can be effectively applied to microarray data analysis for the discovery of gene coexpression and coregulation.
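The two-stage idea described in the abstract (PLS-based recursive feature elimination followed by PLS-based feature extraction on the surviving genes) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the ranking criterion, the elimination schedule, or the number of components, so the choices below (ranking genes by the row norm of the PLS regression coefficients against one-hot class labels, dropping half of the remaining genes per round, two components) are assumptions made for illustration, and the function names `pls_rotation`, `pls_rfe`, and `pls_extract` are hypothetical.

```python
import numpy as np

def one_hot_centered(labels):
    """Encode class labels as a column-centered indicator matrix."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    Y = (labels[:, None] == classes[None, :]).astype(float)
    return Y - Y.mean(axis=0)

def pls_rotation(Xc, Yc, n_components):
    """Return R such that Xc @ R are the PLS scores (inputs column-centered)."""
    Xk = Xc.copy()
    W, P = [], []
    for _ in range(n_components):
        # Weight vector: dominant left singular vector of the cross-covariance.
        u, _, _ = np.linalg.svd(Xk.T @ Yc, full_matrices=False)
        w = u[:, 0]
        t = Xk @ w                    # score vector for this component
        tt = t @ t
        if tt < 1e-12:                # nothing left to explain
            break
        p = Xk.T @ t / tt             # X loading
        Xk = Xk - np.outer(t, p)      # deflate X
        W.append(w)
        P.append(p)
    if not W:
        raise ValueError("X has no covariance with Y")
    W, P = np.column_stack(W), np.column_stack(P)
    return W @ np.linalg.pinv(P.T @ W)

def pls_rfe(X, labels, n_keep, n_components=2, drop_frac=0.5):
    """Recursively drop the lowest-ranked features until n_keep remain.

    Assumed ranking: row norm of the PLS regression coefficient matrix.
    """
    Yc = one_hot_centered(labels)
    active = np.arange(X.shape[1])
    while active.size > n_keep:
        Xc = X[:, active] - X[:, active].mean(axis=0)
        R = pls_rotation(Xc, Yc, min(n_components, active.size))
        T = Xc @ R
        B = R @ (np.linalg.pinv(T) @ Yc)        # regression coefficients
        score = np.linalg.norm(B, axis=1)       # one importance per feature
        n_drop = max(1, min(int(active.size * drop_frac), active.size - n_keep))
        active = np.sort(active[np.argsort(score)[n_drop:]])
    return active

def pls_extract(X, labels, n_components=2):
    """Supervised feature extraction: project samples onto PLS components."""
    Xc = X - X.mean(axis=0)
    return Xc @ pls_rotation(Xc, one_hot_centered(labels), n_components)

# Synthetic stand-in for expression data: 60 samples, 50 "genes", 3 classes;
# only genes 0..4 carry class information.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 20)
X = rng.normal(size=(60, 50))
for j in range(5):
    X[labels == j % 3, j] += 4.0

selected = pls_rfe(X, labels, n_keep=5)
scores = pls_extract(X[:, selected], labels, n_components=2)
print(selected, scores.shape)
```

In this sketch, `selected` plays the role of the biomarker subset and `scores` the extracted comprehensive features that a downstream classifier would consume. On real microarray data, the number of retained genes and components would normally be chosen by cross-validation rather than fixed in advance.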