Abstract: Unsupervised feature selection is fundamental in statistical pattern recognition and has drawn persistent attention over the past several decades. Recently, much work has shown that feature selection can be formulated as nonlinear dimensionality reduction with discrete constraints. This line of research emphasizes the use of manifold learning techniques, where feature selection and learning are studied under the manifold assumption on the data distribution. Many existing feature selection methods, such as Laplacian score, SPEC (spectrum decomposition of graph Laplacian), the TR (trace ratio) criterion, MCFS (multi-cluster feature selection) and EVSC (eigenvalue sensitive criterion), exploit basic properties of the graph Laplacian and select the feature subsets that best preserve the manifold structure defined by it. In this paper, we propose a new perspective on feature selection based on locally linear embedding (LLE), another popular manifold learning method. The main difficulty in using LLE for feature selection is that its optimization involves quadratic programming and eigenvalue decomposition, both of which are continuous procedures and therefore differ from discrete feature selection. We prove that, in the subset selection problem, the LLE objective can be decomposed with respect to the data dimensions, which also facilitates constructing better coordinates from the data using principal component analysis (PCA). Based on these results, we propose a novel unsupervised feature selection algorithm, called locally linear selection (LLS), to select a feature subset that represents the underlying data manifold. The local relationships among samples are computed from the LLE formulation and are then used to estimate the contribution of each individual feature to the underlying manifold structure. These contributions, represented as LLS scores, are used to rank the features, and the ranking determines the candidate feature subset. We further develop a locally linear rotation-selection (LLRS) algorithm, which extends LLS to identify the optimal coordinate subset in a new space. Experimental results on real-world datasets show that our method can be more effective than feature selection methods based on Laplacian eigenmaps.
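To make the decomposition idea concrete, the following is a minimal sketch, not the paper's LLS algorithm. It computes standard LLE reconstruction weights W from k nearest neighbours and then uses the fact that the reconstruction error ||X - WX||_F^2 is a sum of per-feature terms ||(I - W) x_f||^2. Treating that per-feature residual as the score, and preferring features with small residuals, are assumptions made here for illustration; the function names, the neighbourhood size `k`, and the regularisation constant `reg` are likewise hypothetical.

```python
import numpy as np


def lle_weights(X, k=5, reg=1e-3):
    """LLE reconstruction weights; rows of X are samples, each row of W sums to 1."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                      # a sample is not its own neighbour
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]                  # indices of the k nearest neighbours
        Z = X[nbrs] - X[i]                            # neighbours centred on sample i
        G = Z @ Z.T                                   # local Gram matrix (k x k)
        G = G + (reg * np.trace(G) + 1e-12) * np.eye(k)  # regularise for stability
        w = np.linalg.solve(G, np.ones(k))
        W[i, nbrs] = w / w.sum()                      # enforce the sum-to-one constraint
    return W


def per_feature_residuals(X, W):
    """Split the LLE reconstruction error ||X - WX||_F^2 into one term per feature."""
    R = X - W @ X
    return np.sum(R ** 2, axis=0)


def select_features(X, m, k=5):
    """Illustrative ranking: keep the m features with the smallest residual (assumption)."""
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)  # put features on a common scale
    W = lle_weights(Xs, k=k)
    scores = per_feature_residuals(Xs, W)
    return np.argsort(scores)[:m], scores
```

A call such as `select_features(X, m=10)` returns the indices of the retained features together with all per-feature residuals. Because the weights are computed once from all features, this sketch does not address the coupling between the weights and the selected subset, nor the rotation step that LLRS adds.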

Gradient-based Laplacian Feature Selection

G-Optimal Feature Selection with Laplacian regularization

A Variance Minimization Criterion to Feature Selection Using Laplacian Regularization

Laplacian Score for Feature Selection.

Feature selection for high‐dimensional regression via sparse LSSVR based on Lp‐norm

Optimal Feature Selection for Sparse Linear Discriminant Analysis and Its Applications in Gene Expression Data

Local Sparse Discriminative Feature Selection

Unsupervised Feature Selection Via Joint Local Learning and Group Sparse Regression

Group Sparse Feature Selection on Local Learning Based Clustering.

Deep unsupervised feature selection by discarding nuisance and correlated features

Graph-Based Semi-supervised Feature Selection with Application to Automatic Spam Image Identification

Detecting Local Manifold Structure for Unsupervised Feature Selection

Efficient Feature Selection via $\ell_{2,0}$-norm Constrained Sparse Regression.

Spectral Self-supervised Feature Selection

Gram-Schmidt Methods for Unsupervised Feature Extraction and Selection

Clustering-Guided Sparse Structural Learning for Unsupervised Feature Selection

Gradient Boosted Feature Selection.

Feature Gradients: Scalable Feature Selection via Discrete Relaxation

Convex Sparse PCA for Unsupervised Feature Learning.

Locality Sensitive Semi-Supervised Feature Selection