BiFuG2-Spark: Bi-directional Fuzzy Granular-Cabin Parallel Attribute Reduction Accelerator with Granular-Group Collaboration
Hengrong Ju, Tingting Shan, Weiping Ding, Keyu Liu, Muhammad Jabir Khan, Jiashuang Huang, Xibei Yang
DOI: https://doi.org/10.1109/tfuzz.2024.3392328
IF: 12.253
2024-01-01
IEEE Transactions on Fuzzy Systems
Abstract: In the era of Big Data, data are being collected, stored, and analyzed at an unprecedented rate. Because of the quantity, diversity, and complexity of such data, traditional data reduction techniques cannot effectively remove redundant attributes or reduce data uncertainty. In this article, a bi-directional fuzzy granular-cabin parallel attribute reduction accelerator with granular-group collaboration is proposed. First, the in-memory computing technology of Spark is used to divide the dataset into subsets and achieve distributed parallel acceleration. Second, a novel bi-directional fuzzy granular-cabin model is constructed using virtual samples, which greatly compresses the fuzzy neighborhood query space. Then, a granular-group collaboration attribute reduction method is investigated that fuses the proposed granular-cabin with attribute groups: attributes are divided into groups according to their similarity, which reduces the number of iterations needed to evaluate attributes. Finally, the reduction results of the subnodes are aggregated at the master node, and the resulting attribute ranking is evaluated to improve the classification accuracy of the reduct. Experiments are reported on eighteen public datasets, including six large-scale datasets. The results show that the proposed method not only reduces the computational cost but also improves the classification accuracy of the reduced subset.
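The split, group, reduce, and aggregate flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm and uses plain Python rather than Spark. Everything in it is an assumption made for illustration: absolute Pearson correlation stands in for the attribute similarity measure, correlation with the label stands in for attribute significance, round-robin slicing stands in for Spark partitioning, and the function names (`group_attributes`, `local_reduct`, `parallel_reduct`) are hypothetical. The fuzzy granular-cabin model itself is not modeled.

```python
from collections import Counter
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / sqrt(vx * vy)


def group_attributes(rows, threshold=0.9):
    """Greedy grouping: an attribute joins the first group whose first member
    it correlates with at |r| >= threshold (assumed similarity measure)."""
    columns = list(zip(*rows))
    groups = []
    for j, col in enumerate(columns):
        for group in groups:
            if abs(pearson(columns[group[0]], col)) >= threshold:
                group.append(j)
                break
        else:
            groups.append([j])
    return groups


def local_reduct(rows, labels, groups):
    """Subnode step: keep one attribute per group, the one most correlated
    with the label on this partition (assumed significance measure)."""
    columns = list(zip(*rows))
    return [max(g, key=lambda j: abs(pearson(columns[j], labels))) for g in groups]


def parallel_reduct(rows, labels, n_parts=2, threshold=0.9):
    """Master step: count how often each attribute is kept across partitions
    and rank attributes by vote count (ties broken by attribute index)."""
    groups = group_attributes(rows, threshold)
    votes = Counter()
    for p in range(n_parts):
        # Round-robin slicing stands in for distributed partitioning.
        votes.update(local_reduct(rows[p::n_parts], labels[p::n_parts], groups))
    return [attr for attr, _ in sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))]
```

Because each subnode evaluates only one candidate per attribute group instead of every attribute, the number of evaluations per partition drops from the number of attributes to the number of groups, which is the kind of saving the abstract attributes to attribute grouping. The sketch assumes every partition is non-empty.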