Abstract: In the past decades, rapid advances in computer and database technologies have produced increasingly large-scale datasets. At the same time, data mining applications that work on high-dimensional datasets and demand both high speed and high accuracy are multiplying. Semi-supervised learning is a class of machine learning in which unlabeled and labeled data are used simultaneously, here to improve feature selection. The goal of feature selection over partially labeled data (semi-supervised feature selection) is to choose a subset of the available features with the lowest redundancy among themselves and the highest relevancy to the target class, which is the same objective as feature selection over entirely labeled data. The proposed method uses classification to reduce ambiguity in the range of values. First, the similarity values of each pair are collected; these values are then divided into intervals, and the average of each interval is determined. In the next step, the number of pairs falling in each interval is counted. Finally, using the strength and similarity matrices, a new constrained feature selection ranking is proposed. The performance of the presented method was compared with that of state-of-the-art and well-known semi-supervised feature selection approaches on eight datasets. The results indicate that the proposed approach improves on previous related approaches with respect to the accuracy of the constrained score. In particular, the numerical results show that the presented approach improved classification accuracy by about 3% and reduced the number of selected features by 1%. Consequently, the proposed method reduces the computational complexity of the machine learning algorithm while also increasing classification accuracy.
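
The interval steps described in the abstract (collect pairwise similarities, split them into intervals, then compute each interval's average and pair count) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which similarity measure is used, how the intervals are chosen, or how the strength matrix and final ranking are built. The sketch assumes cosine similarity between sample pairs and equal-width intervals, and the names `cosine_similarity`, `summarize_similarity_intervals`, and `n_intervals` are hypothetical. The ranking step is deliberately left out.

```python
from itertools import combinations
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Treat a zero vector as having no similarity to anything.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summarize_similarity_intervals(samples, n_intervals=4):
    """Return (means, counts) for equal-width intervals of pairwise similarity.

    means[k] is the average similarity of the pairs in interval k,
    or None when the interval is empty; counts[k] is the number of pairs.
    """
    # Step 1: collect the similarity value of every pair of samples.
    values = [cosine_similarity(a, b) for a, b in combinations(samples, 2)]

    # Step 2: divide the observed range into equal-width intervals (assumption).
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals or 1.0  # avoid zero width if all values match

    counts = [0] * n_intervals
    sums = [0.0] * n_intervals
    for v in values:
        # The maximum value is clamped into the last interval.
        k = min(int((v - lo) / width), n_intervals - 1)
        counts[k] += 1   # Step 3: count the pairs in each interval.
        sums[k] += v

    # Step 2 (continued): average similarity of each non-empty interval.
    means = [s / c if c else None for s, c in zip(sums, counts)]
    return means, counts


if __name__ == "__main__":
    toy_samples = [[1, 0], [0, 1], [1, 1], [2, 0]]
    means, counts = summarize_similarity_intervals(toy_samples, n_intervals=4)
    print(means, counts)
```

Equal-width intervals are only one possible choice; quantile-based intervals would balance the pair counts instead, and the abstract alone does not indicate which the method uses.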

Semi-supervised feature selection based on discernibility matrix and mutual information

Semi-supervised feature selection based on structure and constraints preserving

An Unsupervised Feature Selection Method Based on Improved ReliefF and Bisecting K-means

Feature Selection with Conditional Mutual Information Considering Feature Interaction

U^2F^2S^2 : Uncovering Feature-level Similarities for Unsupervised Feature Selection

Feature Selection Based on Dependency Margin

A novel method of constrained feature selection by the measurement of pairwise constraints uncertainty

Semi-supervised feature selection by minimum neighborhood redundancy and maximum neighborhood relevancy

Locality Sensitive Semi-Supervised Feature Selection

A Constrained Feature Selection Approach Based on Feature Clustering and Hypothesis Margin Maximization

Efficient Semi-Supervised Feature Selection with Noise Insensitive Trace Ratio Criterion

Semisupervised Feature Selection via Structured Manifold Learning

Discriminative embedded unsupervised feature selection

Sparse semi-supervised multi-label feature selection based on latent representation

A-SFS: Semi-supervised Feature Selection based on Multi-task Self-supervision

Joint Semi-Supervised Feature Selection and Classification Through Bayesian Approach

Semisupervised Feature Selection With Sparse Discriminative Least Squares Regression

A Convex Formulation for Semi-Supervised Multi-Label Feature Selection

Unsupervised Feature Selection Algorithm Based on Dual Manifold Re-ranking

Adaptive Collaborative Correlation Learning-based Semi-Supervised Multi-Label Feature Selection

Unsupervised feature selection via discrete spectral clustering and feature weights