Feature Selection Via Non-convex Constraint and Latent Representation Learning with Laplacian Embedding
Ronghua Shang, Jiarui Kong, Jie Feng, Licheng Jiao
DOI: https://doi.org/10.1016/j.eswa.2022.118179
IF: 8.5
2022-01-01
Expert Systems with Applications
Abstract: In unsupervised feature selection, the relationship between pseudo-labels is often ignored, and the interconnection information between data points is not fully utilized. To address these problems, this paper proposes a feature selection method via non-convex constraint and latent representation learning with Laplacian embedding (NLRL-LE). NLRL-LE preserves the correlation between pseudo-labels so that the pseudo-labels are closer to the true labels, and it combines this with the interconnection information between data to learn a latent representation matrix that guides feature selection. Specifically, NLRL-LE first regards each pseudo-label as a latent feature of the sample, constructs a latent feature graph, and retains the inherent attributes of the pseudo-labels. Second, latent representation learning is performed in the space formed by the latent feature space and the data space. Because the latent feature graph retains the correlation between pseudo-labels and latent representation learning considers the interconnection information between data, the information contained in the latent representation space is more complete. In addition, to make full use of the pseudo-labels, the learned latent representation matrix is used as pseudo-label information, providing cluster labels in the latent representation space to guide feature selection. Finally, a non-negative constraint and an l2,1-2-norm non-convex constraint are applied to the feature transformation matrix. Compared with a convex constraint, the combination of the non-negative and non-convex constraints can ensure the row sparsity of the feature transformation matrix, select low-redundancy features, and improve the feature selection effect. Experimental results on twelve datasets show that NLRL-LE achieves better clustering accuracy (ACC) and normalized mutual information (NMI) than the seven compared algorithms.
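The abstract names the l2,1-2-norm constraint but does not define it. The sketch below illustrates the idea under a stated assumption: that the l2,1-2-norm of a matrix W is the l2,1-norm (sum of row-wise Euclidean norms) minus the Frobenius norm, which is the usual definition in the literature on this penalty and may differ in detail from the paper's formulation. The function names, the toy matrix, and the rule of ranking features by row norm are hypothetical and chosen for illustration only; this is not the NLRL-LE optimization itself.

```python
import numpy as np


def l21_norm(W):
    """l2,1-norm: sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def l21_minus_2_norm(W):
    """Assumed definition: ||W||_{2,1} - ||W||_F (not taken from the abstract)."""
    return l21_norm(W) - float(np.linalg.norm(W, "fro"))


def select_features(W, k):
    """Return indices of the k rows of W with the largest Euclidean norm."""
    scores = np.linalg.norm(W, axis=1)
    # A stable sort keeps tied rows in their original order.
    return np.argsort(-scores, kind="stable")[:k]


# Toy non-negative transformation matrix: 4 features, 2 latent dimensions.
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

penalty = l21_minus_2_norm(W)   # 6 - sqrt(26)
top2 = select_features(W, 2)    # rows 0 and 2 carry all the weight
```

Under this assumed definition the penalty is zero exactly when at most one row of W is non-zero, so driving it down pushes weight into few rows, which is the row sparsity the abstract refers to. Each row of W corresponds to one original feature, so rows with large norm mark the features to keep.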