A Novel Feature Selection Methodology Based on Outlier Detection Technologies

Gang Chen, Yuanli Cai, Juan Shi
DOI: https://doi.org/10.1109/nlpke.2011.6138226
2011-01-01
Abstract: Feature selection is becoming increasingly important for natural language processing and knowledge engineering. In this paper, we propose a simple principle: if an attribute subset is more representative, it should be more self-organized, and as a result it should be less sensitive to artificially seeded noise points. Based on this principle, our methodology transforms feature selection problems into outlier detection problems. Because of the characteristics of outlier detection problems, our framework tolerates noise, sub-sampling, and even classification errors in the training data sets, which distinguishes it from existing methods. To evaluate its performance comprehensively, we compare our method with several state-of-the-art methods on a number of real-life data sets and report all experimental results, which show that our method can accomplish feature reduction tasks with high accuracy and low computational complexity.
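The principle stated in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify. It is one possible reading, under these assumptions: noise points are seeded uniformly inside the bounding box of the data, the outlier score is the distance to the k-th nearest real point, and a subset is rated by how often a seeded point scores higher than a real one. The function names `knn_score`, `separation`, and `best_subset`, and the parameters `n_noise`, `k`, and `seed`, are hypothetical.

```python
import itertools
import math
import random


def knn_score(point, data, k, skip_index=None):
    """Outlier score (assumed): distance from `point` to its k-th nearest row in `data`."""
    dists = sorted(
        math.dist(point, other)
        for i, other in enumerate(data)
        if i != skip_index  # a real point must not count itself as a neighbour
    )
    return dists[k - 1]


def separation(rows, subset, n_noise=50, k=3, seed=0):
    """Fraction of (seeded, real) pairs in which the seeded point looks more outlying.

    Values near 1.0 mean the subset is structured enough that seeded noise
    stands out; values near 0.5 mean noise is indistinguishable from the data.
    """
    rng = random.Random(seed)
    # Project every row onto the candidate feature subset.
    real = [[row[j] for j in subset] for row in rows]
    lows = [min(col) for col in zip(*real)]
    highs = [max(col) for col in zip(*real)]
    # Assumption: noise is seeded uniformly inside the bounding box of the data.
    noise = [
        [rng.uniform(lo, hi) for lo, hi in zip(lows, highs)]
        for _ in range(n_noise)
    ]
    real_scores = [knn_score(p, real, k, skip_index=i) for i, p in enumerate(real)]
    noise_scores = [knn_score(p, real, k) for p in noise]
    wins = sum(n > r for n in noise_scores for r in real_scores)
    return wins / (len(noise_scores) * len(real_scores))


def best_subset(rows, size, **kwargs):
    """Exhaustively pick the feature subset of `size` with the highest separation."""
    n_features = len(rows[0])
    return max(
        itertools.combinations(range(n_features), size),
        key=lambda s: separation(rows, s, **kwargs),
    )
```

Calling `separation(rows, (0, 1))` rates features 0 and 1 together, and `best_subset(rows, 2)` searches all pairs. The exhaustive search is only practical for a handful of features; a greedy or ranked search would be needed at scale, and the abstract does not say which strategy the paper uses.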