Distance-Based Feature Selection from Probabilistic Data

Tingting Zhao, Bin Pei, Suyun Zhao, Hong Chen, Cuiping Li
DOI: https://doi.org/10.1007/978-3-642-38562-9_29
2013-01-01
Abstract: Feature selection is a powerful tool for reducing the dimensionality of datasets. Over the last decade, it has attracted growing attention, and some researchers have begun to study feature selection from probabilistic datasets. However, the existing approach to feature selection from probabilistic data neglects the distance hidden in the data. In this paper, we design a new distance measure for selecting informative features from probabilistic databases, one that considers both the distance and the randomness in the data. We then propose a feature selection algorithm based on the new distance and develop two accelerated algorithms to speed up the computation. Furthermore, we introduce a parameter into the distance to reduce its sensitivity to noise. Finally, experimental results verify the effectiveness of our algorithms.