A Hybrid Improved Ant Colony Optimization and Random Forests Feature Selection Method for Microarray Data

Wen Xiong, Cong Wang
DOI: https://doi.org/10.1109/ncm.2009.66
2009-01-01
Abstract: Microarray gene expression data, which are characterized by small sample sizes and high dimensionality, have been used in cancer discovery and prediction. This paper proposes a hybrid method based on improved Ant Colony Optimization (ACO) and Random Forests (RF) for selecting a small set of marker genes from microarray data to produce a high-accuracy cancer classifier. The method preselects top-ranked features using a t-test statistic combined with the feature importance score estimated by Random Forests. It then uses the combined score as heuristic information and the classification accuracy of Random Forests as positive feedback for the ant colony to refine the preselected feature subset. To accelerate convergence of the ant colony, it distributes ants across different features and limits the solution size so that optimal and near-optimal subsets are found quickly. As a post-processing step, it employs restricted sequential forward selection (SFS) to construct an optimal subset from the near-optimal ones. Experiments on two microarray gene expression datasets show that the proposed method achieves higher recognition accuracy with smaller feature subsets.
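The abstract gives no formulas, so the following is only a minimal sketch of how the preselection and ant-colony steps might fit together, not the authors' implementation. Everything specific in it is an assumption made for illustration: the equal-weight blend of the two normalised scores, the Welch form of the t-statistic, the selection rule proportional to pheromone times heuristic, the evaporation rate `rho`, and all function names. The Random Forests importance scores and the classification accuracy are passed in as plain numbers (hypothetical values here) rather than computed.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Absolute Welch t-statistic between the two class samples of one gene."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0


def normalise(scores):
    """Scale scores to [0, 1] so the two criteria are comparable."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]


def combined_scores(X, y, importance, weight=0.5):
    """Blend normalised |t| with a normalised importance score (assumed blend)."""
    t = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        t.append(welch_t(pos, neg))
    t_n, imp_n = normalise(t), normalise(importance)
    return [weight * a + (1 - weight) * b for a, b in zip(t_n, imp_n)]


def preselect(scores, k):
    """Indices of the k top-ranked features, best first."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def build_subset(candidates, pheromone, heuristic, size, rng, alpha=1.0, beta=1.0):
    """One ant builds a fixed-size subset; pick weight is tau^alpha * eta^beta."""
    remaining = list(candidates)
    subset = []
    while remaining and len(subset) < size:
        # The small constant keeps every weight positive when a score is zero.
        weights = [pheromone[j] ** alpha * heuristic[j] ** beta + 1e-12
                   for j in remaining]
        choice = rng.choices(remaining, weights=weights, k=1)[0]
        subset.append(choice)
        remaining.remove(choice)
    return subset


def reinforce(pheromone, subset, accuracy, rho=0.1):
    """Evaporate all trails, then deposit the subset's accuracy as feedback."""
    updated = [(1 - rho) * tau for tau in pheromone]
    for j in subset:
        updated[j] += accuracy
    return updated


# Toy data: 6 samples, 4 genes; only gene 0 separates the two classes.
X = [
    [5.1, 1.0, 0.2, 3.0],
    [4.9, 1.2, 0.1, 3.1],
    [5.0, 0.9, 0.3, 2.9],
    [1.0, 1.1, 0.2, 3.0],
    [1.2, 1.0, 0.4, 3.2],
    [0.9, 1.3, 0.1, 2.8],
]
y = [1, 1, 1, 0, 0, 0]
importance = [0.7, 0.1, 0.1, 0.1]  # hypothetical Random Forests importances

scores = combined_scores(X, y, importance)
top = preselect(scores, 3)
rng = random.Random(0)
pheromone = [1.0] * len(scores)
subset = build_subset(top, pheromone, scores, size=2, rng=rng)
pheromone = reinforce(pheromone, subset, accuracy=0.9)  # accuracy is made up
```

In a fuller version the accuracy passed to `reinforce` would come from training a Random Forests classifier on the chosen subset, and the loop over ants and iterations would be followed by the restricted SFS step, both of which are omitted here.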