A Novel Spark-Based Attribute Reduction and Neighborhood Classification for Rough Evidence
Weiping Ding, Ying Sun, Ming Li, Jun Liu, Hengrong Ju, Jiashuang Huang, Chin-Teng Lin
DOI: https://doi.org/10.1109/tcyb.2022.3208130
IF: 11.8
2022-01-01
IEEE Transactions on Cybernetics
Abstract: Neighborhood classification (NEC) algorithms have been widely used to solve classification problems. Most traditional NEC algorithms use majority voting as the basis for the final decision. However, this mechanism largely ignores the spatial differences and label uncertainty among the neighborhood samples, which may increase the likelihood of misclassification. In addition, traditional NEC algorithms need to load the entire dataset into memory at once, which is computationally inefficient when the dataset is large. To address these problems, we propose a novel Spark-based attribute reduction and NEC method for rough evidence in this article. Specifically, we first construct a multigranular sample space using a parallel undersampling method. Then, we evaluate the significance of each attribute by the neighborhood rough evidence decision error rate and remove redundant attributes on different sample subspaces. Based on this attribute reduction algorithm, we design a parallel attribute reduction algorithm that computes equivalence classes in parallel and parallelizes the search for candidate attributes. Finally, we introduce rough evidence into the classification decision of traditional NEC algorithms and parallelize the classification decision process. The proposed algorithms are implemented in the Spark parallel computing framework. Experimental results on both small and large-scale datasets show that the proposed algorithms outperform the benchmark algorithms in both classification accuracy and computational efficiency.
automation & control systems, computer science, cybernetics, artificial intelligence
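
For orientation, the traditional majority-voting decision that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the baseline only, not the authors' rough-evidence method or their Spark implementation. The function name `neighborhood_majority_vote`, the Euclidean metric, the radius parameter `delta`, and the nearest-sample fallback for an empty neighborhood are all assumptions made for this example.

```python
import math
from collections import Counter


def neighborhood_majority_vote(train_X, train_y, query, delta):
    """Label `query` by majority vote over training samples within radius `delta`.

    Assumptions (not taken from the paper): Euclidean distance, and a
    fallback to the single nearest sample when the neighborhood is empty.
    """
    # Distance from the query to every training sample.
    dists = [math.dist(x, query) for x in train_X]

    # Labels of the samples that fall inside the delta-neighborhood.
    neighbor_labels = [y for d, y in zip(dists, train_y) if d <= delta]

    if not neighbor_labels:
        # Empty neighborhood: use the label of the nearest sample instead.
        nearest = min(range(len(train_X)), key=dists.__getitem__)
        return train_y[nearest]

    # Every neighbor counts equally, however far it is from the query.
    return Counter(neighbor_labels).most_common(1)[0][0]


# Tiny hypothetical dataset with two classes.
samples = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (1.0, 1.0), (1.1, 0.9)]
labels = ["a", "a", "b", "b", "b"]
prediction = neighborhood_majority_vote(samples, labels, (0.05, 0.0), delta=0.3)
```

Each neighbor contributes one equal vote here, so neither its distance from the query nor the mix of labels around it affects the outcome. That is the limitation the abstract attributes to majority voting.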