Active Sample Learning and Feature Selection: A Unified Approach

Changsheng Li, Xiangfeng Wang, Weishan Dong, Junchi Yan, Qingshan Liu, H. Zha
2015-03-03
Abstract: This paper addresses simultaneous sample and feature selection for machine learning in a fully unsupervised setting. Most existing work tackles these two problems separately, giving rise to two well-studied sub-areas, active learning and feature selection. A unified approach is attractive because the two problems are often interleaved: noisy, high-dimensional features adversely affect sample selection, while 'good' samples benefit feature selection. We present a unified framework that performs active learning and feature selection simultaneously. From a data reconstruction perspective, the selected samples and the selected features are each chosen to best approximate the original dataset, so that the selected samples, characterized by the selected features, are highly representative. In addition, our method is one-shot: it does not iteratively select samples for progressive labeling. The model is therefore especially suitable when initial labeled samples are scarce or entirely absent, a setting that existing work hardly addresses, particularly in combination with feature selection. To alleviate the NP-hardness of the original problem, the proposed formulation is cast as a convex but non-smooth optimization problem. We solve it efficiently with an iterative algorithm and prove its global convergence. Experiments on publicly available datasets show that our method is promising compared with state-of-the-art methods.
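The reconstruction idea in the abstract can be illustrated with a small sketch. The abstract does not state the paper's objective, so this is not the authors' formulation. It assumes a common stand-in: minimize 0.5 * ||X - XW||_F^2 + lam * ||W||_{2,1} by proximal gradient descent, which is convex but non-smooth, and rank columns of X by the row norms of W. The names `row_soft_threshold`, `select_representatives`, `lam`, and `k` are hypothetical. Unlike the unified framework described above, the sketch selects samples and features in two separate calls.

```python
import numpy as np


def row_soft_threshold(W, tau):
    """Proximal operator of tau * ||W||_{2,1}: shrink each row's L2 norm by tau."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * W


def objective(X, W, lam):
    """Reconstruction error plus the row-sparsity penalty (assumed objective)."""
    residual = X - X @ W
    return 0.5 * np.sum(residual ** 2) + lam * np.sum(np.linalg.norm(W, axis=1))


def select_representatives(X, lam, k, n_iter=500):
    """Rank the columns of X by how much they help reconstruct all of X.

    Row i of W holds the weights with which column i of X is used, so a
    large row norm marks column i as representative.
    """
    n = X.shape[1]
    G = X.T @ X
    # Step size 1/L, where L is the Lipschitz constant of the smooth part.
    step = 1.0 / max(np.linalg.norm(G, 2), 1e-12)
    W = np.zeros((n, n))
    for _ in range(n_iter):
        grad = G @ W - G  # gradient of 0.5 * ||X - X W||_F^2
        W = row_soft_threshold(W - step * grad, step * lam)
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(-scores)[:k], W


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 20))  # 5 features (rows), 20 samples (columns)
    sample_idx, _ = select_representatives(X, lam=0.5, k=3)
    feature_idx, _ = select_representatives(X.T, lam=0.5, k=2)
    print("selected samples:", sample_idx)
    print("selected features:", feature_idx)
```

Passing `X` ranks samples and passing `X.T` ranks features. The whole ranking is computed without labels and without a labeling loop, which mirrors the one-shot, unsupervised setting the abstract describes.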
Computer Science