Semi-supervised feature selection based on label propagation and subset selection

Yun Liu, Feiping Nie, Jigang Wu, Lihui Chen
DOI: https://doi.org/10.1109/ICCIA.2010.6141595
2010-01-01
Abstract: In practice, the data to be handled are often high-dimensional, and labeled data are often very limited, while a large number of unlabeled data can easily be collected. Feature selection is an important method for dealing with high-dimensional data. In this paper, we propose a novel semi-supervised feature selection algorithm that selects relevant features using both labeled and unlabeled data. Specifically, the algorithm explores the distribution of the labeled and unlabeled data with a special label propagation method to obtain soft labels for the unlabeled data; an efficient algorithm for optimizing the trace ratio criterion is then used to directly select the optimal feature subset. Experimental results verify the effectiveness of the proposed algorithm and show significant improvement over traditional supervised feature selection algorithms.
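The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the "special label propagation method", so the sketch substitutes the standard closed-form propagation F = (I - alpha*S)^(-1) Y on a Gaussian affinity graph, and the soft-label-weighted scatter values are likewise an assumption. The function names (`propagate_labels`, `soft_scatter`, `trace_ratio_select`) and the parameters `alpha` and `sigma` are hypothetical.

```python
import numpy as np


def propagate_labels(X, y, alpha=0.9, sigma=1.0):
    """Soft labels via F = (I - alpha*S)^-1 Y (assumed stand-in, not the paper's method).

    y holds class ids for labeled points and -1 for unlabeled points.
    """
    n = X.shape[0]
    classes = np.unique(y[y >= 0])
    # Gaussian affinity over labeled and unlabeled points, zero diagonal.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))          # symmetric normalisation D^-1/2 W D^-1/2
    Y = np.zeros((n, classes.size))
    for c, label in enumerate(classes):
        Y[y == label, c] = 1.0               # one-hot rows for labeled points only
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)
    F = np.clip(F, 0.0, None)
    F /= F.sum(axis=1, keepdims=True) + 1e-12  # rows become soft class memberships
    return F


def soft_scatter(X, F):
    """Per-feature between-class (b) and within-class (w) scatter, weighted by F (assumption)."""
    mass = F.sum(axis=0)                     # soft class sizes, shape (c,)
    means = (F.T @ X) / mass[:, None]        # soft class means, shape (c, d)
    overall = X.mean(axis=0)
    b = (mass[:, None] * (means - overall) ** 2).sum(axis=0)
    w = np.zeros(X.shape[1])
    for c in range(F.shape[1]):
        w += (F[:, c:c + 1] * (X - means[c]) ** 2).sum(axis=0)
    return b, w


def trace_ratio_select(b, w, k, max_iter=100):
    """Choose k features maximising sum(b[idx]) / sum(w[idx]) by iterative re-scoring."""
    idx = np.argsort(-(b / (w + 1e-12)))[:k]  # start from the per-feature ratio ranking
    for _ in range(max_iter):
        lam = b[idx].sum() / (w[idx].sum() + 1e-12)   # trace ratio of the current subset
        new_idx = np.argsort(-(b - lam * w))[:k]      # re-score features against lam
        if set(new_idx) == set(idx):
            break
        idx = new_idx
    return np.sort(idx)
```

A typical call would chain the three steps: `F = propagate_labels(X, y)`, then `b, w = soft_scatter(X, F)`, then `trace_ratio_select(b, w, k)`. The selection step scores the subset as a whole (the ratio of summed scatters) rather than ranking features by their individual ratios, which is the distinction the trace ratio criterion draws against per-feature scoring.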