Abstract: Over the past decades, advances in computer and database technologies have led to rapid growth in large-scale datasets. At the same time, data mining applications that operate on high-dimensional datasets and demand both speed and accuracy are multiplying. Semi-supervised learning is a class of machine learning in which unlabeled and labeled data are used together, here to improve feature selection. The goal of feature selection over partially labeled data (semi-supervised feature selection) is to choose a subset of the available features with the lowest redundancy among themselves and the highest relevance to the target class, which is the same objective as feature selection over entirely labeled data. The proposed method uses classification to reduce ambiguity in the range of values. First, the similarity value of each pair is collected; these values are then divided into intervals, and the average of each interval is determined. Next, the number of pairs falling in each interval is counted. Finally, using the strength and similarity matrices, a new constrained feature selection ranking is proposed. The method was compared with state-of-the-art and well-known semi-supervised feature selection approaches on eight datasets. The results indicate that the proposed approach improves on previous related approaches with respect to the accuracy of the constrained score. In particular, the numerical results show that it improved classification accuracy by about 3% and reduced the number of selected features by 1%. Consequently, the proposed method reduces the computational complexity of the machine learning algorithm while increasing classification accuracy.
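The interval steps described in the abstract (collect pair similarities, divide them into intervals, average each interval, count the pairs per interval) can be illustrated with a minimal sketch. Several choices here are assumptions that the abstract does not specify: the pairs are taken to be pairs of samples, similarity is taken to be cosine similarity, and the intervals are equal-width. The function name `pair_similarity_histogram` and its parameters are hypothetical. The final ranking step, which relies on the strength and similarity matrices, is not defined in the abstract and is not attempted here.

```python
import numpy as np


def pair_similarity_histogram(X, pairs, n_bins=5):
    """Bin pairwise similarities and summarize each interval.

    Assumptions (not taken from the paper): cosine similarity between
    sample rows of X, and equal-width intervals over the observed range.
    """
    X = np.asarray(X, dtype=float)
    pairs = np.asarray(pairs, dtype=int)

    # Step 1: collect the similarity value of each pair.
    a, b = X[pairs[:, 0]], X[pairs[:, 1]]
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    sims = np.einsum("ij,ij->i", a, b) / np.where(norms == 0, 1.0, norms)

    # Step 2: divide the values into n_bins equal-width intervals.
    edges = np.linspace(sims.min(), sims.max(), n_bins + 1)
    # Using only the interior edges yields bin indices 0 .. n_bins - 1,
    # with the maximum value falling in the last interval.
    idx = np.digitize(sims, edges[1:-1])

    # Step 3: count the pairs in each interval.
    counts = np.bincount(idx, minlength=n_bins)

    # Step 4: average similarity per interval (NaN for empty intervals).
    sums = np.bincount(idx, weights=sims, minlength=n_bins)
    means = np.divide(sums, counts, out=np.full(n_bins, np.nan), where=counts > 0)
    return sims, edges, means, counts


if __name__ == "__main__":
    # Toy data: four 2-D samples and four illustrative pairs.
    X = [[1, 0], [0, 1], [1, 1], [2, 0]]
    pairs = [(0, 1), (0, 2), (0, 3), (1, 2)]
    sims, edges, means, counts = pair_similarity_histogram(X, pairs, n_bins=2)
    print("similarities:", sims)
    print("interval edges:", edges)
    print("interval means:", means)
    print("pairs per interval:", counts)
```

Equal-width binning is the simplest reading of "divided into intervals"; quantile-based intervals would be an equally plausible alternative and would change only the computation of `edges`.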

Semi-supervised Minimum Redundancy Maximum Relevance Feature Selection for Audio Classification

Semi-supervised Feature Selection for Audio Classification Based on Constraint Compensated Laplacian Score

CLDA: Feature Selection for Text Categorization Based on Constrained LDA

Semi-supervised feature selection by minimum neighborhood redundancy and maximum neighborhood relevancy

Feature Selection Method on Imbalanced Text

MVMR-FS: Non-parametric feature selection algorithm based on Maximum inter-class Variation and Minimum Redundancy

Feature Selection with Integrated Relevance and Redundancy Optimization

A Constrained Feature Selection Approach Based on Feature Clustering and Hypothesis Margin Maximization

An Efficient Feature Selection in Classification of Audio Files

A New Method for Redundancy Analysis in Feature Selection

Perceptual Similarity Between Audio Clips and Feature Selection for Its Measurement

A Variance Minimization Criterion to Feature Selection Using Laplacian Regularization

Hybrid Independent Component Analysis and Rough Set Approach for Audio Feature Extraction

A novel method of constrained feature selection by the measurement of pairwise constraints uncertainty

An Improved Feature Selection Algorithm for Ordinal Classification

Aggressive Dimensionality Reduction With Reinforcement Local Feature Selection For Text Categorization

Semi-supervised feature selection based on discernibility matrix and mutual information

Feature Selection with Missing Labels Using Multilabel Fuzzy Neighborhood Rough Sets and Maximum Relevance Minimum Redundancy

Efficient Semi-Supervised Feature Selection with Noise Insensitive Trace Ratio Criterion

A Method Based on General Model and Rough Set for Audio Classification

Locality Sensitive Semi-Supervised Feature Selection