Abstract: Attribute reduction, or attribute subset selection, is an essential data pre-processing task in applications across the engineering domains that fall under the broad spectrum of artificial intelligence. The choice of attribute subset and the significance of each selected attribute greatly affect the classification performance of any machine learning algorithm. Rough set theory-based solutions for attribute subset selection have proven very effective for categorical information systems. However, most of these attribute reduction algorithms are serial in nature. They are either inefficient on datasets with a very large number of dimensions, or their efficiency is overshadowed by high computational costs. Hence, they are becoming inapplicable to current data processing requirements. To address this problem, we first propose a novel and efficient attribute reduction algorithm named Reduction of Attributes based on Association and Separation (RAAS). This algorithm is based on two measures: the degree of association (DA) of objects within a class and the degree of separation (DS) among objects of different classes. These measures are used to evaluate the significance of each attribute as well as the classification ability of each attribute subset. A sequential backward elimination strategy using the DA and the DS is designed to obtain the optimal attribute subset. RAAS is evaluated against other typical reduction algorithms on a few publicly available standard datasets from the UCI data repository. The experimental results show that RAAS produces better classification accuracies than the others. We then design a parallel version of RAAS, called Parallel Attribute Reduction Algorithm based on Association and Separation (PARAAS), which is both efficient and fast. PARAAS is the first algorithm designed specifically to perform attribute reduction of large-dimensional categorical datasets on graphics processing units (GPUs) that support CUDA. Experimental analysis suggests that PARAAS can produce high classification accuracies with low execution times.
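
The sequential backward elimination strategy mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not define DA or DS, so the `pair_scores` function below uses simple stand-in measures chosen purely for illustration (agreement among same-class pairs, disagreement among different-class pairs), and the stopping rule in `backward_eliminate` is likewise an assumption. Both function names are hypothetical and this is not the RAAS algorithm itself.

```python
from itertools import combinations


def pair_scores(rows, labels, attrs):
    """Return illustrative (da, ds) stand-ins for a categorical table.

    ASSUMPTION: these are not the paper's definitions.
    da: mean fraction of attributes in `attrs` on which same-class pairs agree.
    ds: mean fraction of attributes in `attrs` on which different-class pairs differ.
    """
    same, diff = [], []
    for i, j in combinations(range(len(rows)), 2):
        agree = sum(rows[i][a] == rows[j][a] for a in attrs) / len(attrs)
        if labels[i] == labels[j]:
            same.append(agree)
        else:
            diff.append(1.0 - agree)
    da = sum(same) / len(same) if same else 0.0
    ds = sum(diff) / len(diff) if diff else 0.0
    return da, ds


def backward_eliminate(rows, labels):
    """Greedy sequential backward elimination over attribute indices.

    Starts from the full attribute set and repeatedly drops the attribute
    whose removal gives the highest da + ds, stopping as soon as every
    possible removal would lower the score (ties still remove).
    """
    attrs = list(range(len(rows[0])))
    best = sum(pair_scores(rows, labels, attrs))
    while len(attrs) > 1:
        candidates = []
        for a in attrs:
            rest = [b for b in attrs if b != a]
            candidates.append((sum(pair_scores(rows, labels, rest)), rest))
        score, rest = max(candidates, key=lambda c: c[0])
        if score < best:
            break  # every removal hurts: keep the current subset
        best, attrs = score, rest
    return attrs, best
```

On a toy table such as `[("a", "x", "k"), ("a", "y", "k"), ("b", "x", "k"), ("b", "y", "k")]` with labels `[0, 0, 1, 1]`, where only the first attribute tracks the class, this sketch discards the noisy and constant attributes and keeps attribute 0. Each candidate evaluation is independent of the others, which is the kind of work a parallel version could distribute.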

Unsupervised Attribute Reduction: Improving Effectiveness and Efficiency

Unsupervised attribute reduction algorithm framework based on spectral clustering and attribute significance function

Novel algorithm for attribute reduction based on mutual-information gain ratio

A Q-learning Approach to Attribute Reduction

Optimizing Attribute Reduction in Multi-Granularity Data through a Hybrid Supervised–Unsupervised Model

Quick attribute reduct algorithm for neighborhood rough set model

Parallel incremental efficient attribute reduction algorithm based on attribute tree

An Ensemble Framework to Forest Optimization Based Reduct Searching

Similarity-based Attribute Reduction in Rough Set Theory: a Clustering Perspective

Multi-objective Attribute Reduction in Three-Way Decision-Theoretic Rough Set Model

Positive Macroscopic Approximation for Fast Attribute Reduction

Unsupervised attribute reduction based on neighborhood dependency

Efficient and Fast Algorithm for Attribute Reduction of Large Dimensional Data Using Rough Set Theory on Graphics Processing Unit

A novel attribute reduction method based on intuitionistic fuzzy three-way cognitive clustering

A Multi-objective Attribute Reduction Method in Decision-Theoretic Rough Set Model

The Rule-Matching Algorithm of Decision Tree Attribute Reduction

Parallel Selector for Feature Reduction

A Moderate Attribute Reduction Approach In Decision-Theoretic Rough Set

Heuristic Reduction Algorithm Based on Attribute Importance

Attribute Reduction Method Based on Inter-Class Separability

A Parallel Attribute Reduction Method Based on Classification