Dictionary Learning for Unsupervised Feature Selection Via Dual Sparse Regression

Wu Jian-Sheng, Liu Jing-Xin, Wu Jun-Yun, Huang Wei
DOI: https://doi.org/10.1007/s10489-023-04480-0
IF: 5.3
2023-01-01
Applied Intelligence
Abstract: With the explosive growth of unlabeled, high-dimensional data, unsupervised feature selection has become an essential step in many machine learning and data mining tasks. In recent years, many dictionary-learning-based models have been successfully developed for unsupervised feature selection. These models learn an over-complete dictionary to capture more information about the data distribution. However, over-complete dictionary learning introduces redundancy into the latent representations of the data. Moreover, if the data contain noise, dictionary learning also carries that noise into the latent representations. In this paper, we propose a novel unsupervised feature selection framework, named dictionary learning for unsupervised feature selection via dual sparse regression. In this model, dictionary learning is first embedded into a sparse regression to learn an over-complete dictionary with sparse representations of the data, in which redundancy and noise are eliminated. The data are then projected onto these representations, and a second sparse regression evaluates the significance of each feature. We also offer an efficient algorithm to solve this problem and theoretically analyze its convergence and its computational complexity, which is proportional to the data dimensionality. Finally, evaluation results on 9 benchmark datasets, obtained by running k-means on the selected features, demonstrate the superiority of our approach in terms of effectiveness and efficiency.
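The two-regression idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm: the paper embeds dictionary learning in a joint model, whereas the sketch below simply runs two stages one after the other, using ISTA-style sparse coding followed by an iteratively reweighted l2,1-regularized regression. All function names, the hyperparameters `alpha` and `beta`, and the choice of row norms of the projection matrix `W` as feature scores are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(A, t):
    # Elementwise soft-thresholding, the proximal operator of the l1 norm.
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def sparse_codes(X, n_atoms, alpha=0.1, n_iter=30, seed=0):
    # Stage 1 (assumed simplification): alternate one ISTA step on the
    # codes Z with a least-squares update of the dictionary D for
    #   0.5 * ||X - Z D||_F^2 + alpha * ||Z||_1.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    D = rng.standard_normal((n_atoms, d))
    D /= np.linalg.norm(D, axis=1, keepdims=True)
    Z = np.zeros((n, n_atoms))
    for _ in range(n_iter):
        # Step size 1/L, where L bounds the curvature of the smooth term.
        L = max(np.linalg.norm(D, 2) ** 2, 1e-12)
        grad = (Z @ D - X) @ D.T
        Z = soft_threshold(Z - grad / L, alpha / L)
        # Dictionary update with a small ridge term for numerical stability.
        G = Z.T @ Z + 1e-6 * np.eye(n_atoms)
        D = np.linalg.solve(G, Z.T @ X)
        D /= np.maximum(np.linalg.norm(D, axis=1, keepdims=True), 1e-12)
    return Z, D

def l21_regression(X, Z, beta=1.0, n_iter=30, eps=1e-8):
    # Stage 2 (assumed): iteratively reweighted least squares for
    #   ||X W - Z||_F^2 + beta * ||W||_{2,1}.
    d = X.shape[1]
    q = np.ones(d)
    XtX, XtZ = X.T @ X, X.T @ Z
    for _ in range(n_iter):
        W = np.linalg.solve(XtX + beta * np.diag(q), XtZ)
        q = 1.0 / (2.0 * np.linalg.norm(W, axis=1) + eps)
    return W

def select_features(X, n_atoms, n_selected, alpha=0.1, beta=1.0):
    # Score each feature by the l2 norm of its row in W, highest first.
    Z, _ = sparse_codes(X, n_atoms, alpha)
    W = l21_regression(X, Z, beta)
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(scores)[::-1][:n_selected], scores

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((60, 8))   # 60 samples, 8 features
    X[:, 5] = 0.0                      # an uninformative, constant feature
    selected, scores = select_features(X, n_atoms=12, n_selected=3)
    print("selected feature indices:", selected)
```

Here `n_atoms=12` exceeds the 8 input features, so the dictionary is over-complete in the sense the abstract uses. The selected columns `X[:, selected]` could then be passed to any k-means implementation, mirroring the evaluation protocol mentioned above.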