Incremental feature selection for large-scale hierarchical classification with the arrival of new samples

Yang Tian, Yanhong She
DOI: https://doi.org/10.1007/s10489-024-05352-x
IF: 5.3
2024-03-16
Applied Intelligence
Abstract: In the era of big data, the number of class labels is growing rapidly, which poses a great challenge to classification tasks. Hierarchical classification was thus introduced to address this issue by considering the structural information between different class labels. In this paper, we propose an incremental feature selection algorithm for handling the arrival of new samples by using the theory of fuzzy rough sets. As a preliminary step, we propose a non-incremental hierarchical feature selection algorithm, which is an improved version of the existing method. Then, utilizing the sibling strategy, we discuss the incremental calculation of the dependency degree when new samples arrive. Based on the analysis of the change in dependency degree, we design feature addition and deletion strategies, as well as the incremental feature selection algorithm. In the experimental section, two versions of the incremental algorithm are designed. The experimental results show that our improvement of the existing method is highly effective and can significantly accelerate the process of feature selection. In addition, version 2 of the incremental algorithm exhibits much higher efficiency than both the improved non-incremental algorithm and the existing method on several datasets. Compared to six hierarchical feature selection algorithms, our algorithm achieves better results in classification accuracy and on three hierarchical evaluation metrics. The effectiveness and efficiency of version 1 are also verified by the comparison with version 2 and other results.
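The abstract names two ingredients, a fuzzy-rough dependency degree restricted by the sibling strategy and its incremental update when samples arrive, without giving formulas. The following is a minimal sketch of how such a computation could look, not the authors' algorithm. It assumes feature values scaled to [0, 1], a min-based fuzzy similarity, the standard fuzzy lower approximation, and negatives limited to samples whose class shares a parent with the sample's own class. All names (`similarity`, `dependency_degree`, `IncrementalDependency`) and the toy data are hypothetical.

```python
from typing import Dict, List, Sequence


def similarity(x: Sequence[float], y: Sequence[float], features: Sequence[int]) -> float:
    """Assumed fuzzy similarity: min over selected features of 1 - |x_a - y_a|."""
    return min((1.0 - abs(x[a] - y[a]) for a in features), default=1.0)


def is_negative(a: str, b: str, parent: Dict[str, str]) -> bool:
    """Sibling strategy (as assumed here): different classes under the same parent."""
    return a != b and parent[a] == parent[b]


def dependency_degree(X: List[Sequence[float]], labels: List[str],
                      parent: Dict[str, str], features: Sequence[int]) -> float:
    """Batch version: mean lower-approximation membership over all samples."""
    total = 0.0
    for i in range(len(X)):
        # Lower approximation: distance to the most similar sibling-class sample.
        total += min((1.0 - similarity(X[i], X[j], features)
                      for j in range(len(X))
                      if is_negative(labels[i], labels[j], parent)), default=1.0)
    return total / len(X)


class IncrementalDependency:
    """Keeps per-sample lower-approximation values and updates them per arrival."""

    def __init__(self, parent: Dict[str, str], features: Sequence[int]) -> None:
        self.parent = parent
        self.features = list(features)
        self.X: List[Sequence[float]] = []
        self.labels: List[str] = []
        self.lower: List[float] = []

    def add(self, x: Sequence[float], label: str) -> None:
        new_lower = 1.0
        for i, (xi, li) in enumerate(zip(self.X, self.labels)):
            if is_negative(li, label, self.parent):
                d = 1.0 - similarity(xi, x, self.features)
                # The new sample can only lower an old sample's membership.
                self.lower[i] = min(self.lower[i], d)
                new_lower = min(new_lower, d)
        self.X.append(x)
        self.labels.append(label)
        self.lower.append(new_lower)

    def degree(self) -> float:
        return sum(self.lower) / len(self.lower) if self.lower else 0.0


# Toy two-level hierarchy and data, invented purely for illustration.
parent = {"cat": "animal", "dog": "animal", "car": "vehicle", "bus": "vehicle"}
X = [[0.1, 0.9], [0.2, 0.8], [0.8, 0.1], [0.9, 0.3]]
labels = ["cat", "dog", "car", "bus"]

inc = IncrementalDependency(parent, [0, 1])
for sample, label in zip(X, labels):
    inc.add(sample, label)
print(inc.degree(), dependency_degree(X, labels, parent, [0, 1]))
```

Under these assumptions, each arriving sample costs one pass over the stored samples, whereas the batch function recomputes every pairwise comparison. That is one plausible source of the efficiency gain the abstract reports, although the paper's actual update rules may differ.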
computer science, artificial intelligence