Abstract: In the past decades, the rapid growth of computer and database technologies has led to a rapid growth in large-scale datasets. At the same time, data mining applications that work on high-dimensional datasets and require high speed and accuracy are rapidly increasing. Semi-supervised learning is a class of machine learning in which unlabeled and labeled data are used together, here to improve feature selection. The goal of feature selection over partially labeled data (semi-supervised feature selection) is to choose a subset of the available features with the lowest redundancy among themselves and the highest relevance to the target class, which is the same objective as feature selection over entirely labeled data. The proposed method uses classification to reduce ambiguity in the range of values. First, the similarity values of each pair are collected; these values are then divided into intervals, and the average of each interval is determined. In the next step, the number of pairs falling in each interval is counted. Finally, using the strength and similarity matrices, a new constrained feature selection ranking is proposed. The performance of the presented method was compared with that of state-of-the-art and well-known semi-supervised feature selection approaches on eight datasets. The results indicate that the proposed approach improves on previous related approaches with respect to the accuracy of the constrained score. In particular, the numerical results show that the presented approach improved classification accuracy by about 3% and reduced the number of selected features by 1%. Consequently, the proposed method reduces the computational complexity of the machine learning algorithm while also increasing classification accuracy.
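The interval step described in the abstract (collect pairwise similarity values, split them into intervals, then take the average and the pair count of each interval) can be sketched as follows. This is only an illustrative sketch under assumptions the abstract does not confirm: the pairs are taken to be pairs of samples, cosine similarity stands in for the unspecified similarity measure, the intervals are equal-width, and the names `cosine_similarity`, `summarize_similarity_intervals`, and `n_intervals` are hypothetical. It does not cover the strength matrix or the final ranking.

```python
from itertools import combinations
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors.

    Assumption: the abstract does not name the similarity measure.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def summarize_similarity_intervals(samples, n_intervals=4):
    """Bin pairwise similarities into equal-width intervals.

    Returns one (low, high, mean, count) tuple per interval; the mean is
    None for an interval that contains no pairs.
    """
    # Step 1: collect the similarity value of every pair of samples.
    sims = [cosine_similarity(a, b) for a, b in combinations(samples, 2)]

    # Step 2: divide the observed range into equal-width intervals
    # (assumption: the abstract does not say how intervals are chosen).
    low, high = min(sims), max(sims)
    width = (high - low) / n_intervals or 1.0  # guard against zero width
    buckets = [[] for _ in range(n_intervals)]
    for s in sims:
        # The maximum value falls into the last interval.
        idx = min(int((s - low) / width), n_intervals - 1)
        buckets[idx].append(s)

    # Step 3: average and pair count for each interval.
    summary = []
    for i, bucket in enumerate(buckets):
        mean = sum(bucket) / len(bucket) if bucket else None
        summary.append((low + i * width, low + (i + 1) * width, mean, len(bucket)))
    return summary
```

Calling `summarize_similarity_intervals(samples, n_intervals=4)` on a list of numeric vectors gives the per-interval averages and counts, which a fuller implementation would presumably feed into the strength and similarity matrices mentioned in the abstract.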

U^2F^2S^2: Uncovering Feature-level Similarities for Unsupervised Feature Selection

Unsupervised Feature Selection with Ordinal Locality.

An Unsupervised Feature Selection Method Based on Improved ReliefF and Bisecting K-means

A New Unsupervised Feature Selection Algorithm Using Similarity-Based Feature Clustering.

Rethinking Embedded Unsupervised Feature Selection: A Simple Joint Approach

Unsupervised feature selection with high-order similarity learning

A Constrained Feature Selection Approach Based on Feature Clustering and Hypothesis Margin Maximization

A Feature Selection Framework Based on Supervised Data Clustering

Simultaneous local clustering and unsupervised feature selection via strong space constraint

Unsupervised feature selection via dual space-based low redundancy scores and extended OLSDA

Unsupervised feature selection for multi-cluster data

Clustering-Guided Sparse Structural Learning for Unsupervised Feature Selection

A Contrast Based Feature Selection Algorithm for High-dimensional Data set in Machine Learning

A novel method of constrained feature selection by the measurement of pairwise constraints uncertainty

Feature Selection with Attributes Clustering by Maximal Information Coefficient

Multiview Data Clustering with Similarity Graph Learning Guided Unsupervised Feature Selection

Dependence Guided Unsupervised Feature Selection

Efficient and Stable Unsupervised Feature Selection Based on Novel Structured Graph and Data Discrepancy Learning

Unsupervised Feature Selection Via Metric Fusion and Novel Low-Rank Approximation

Feature Selection Based on Data Clustering