Multi-objective Evolutionary Instance Selection for Multi-label Classification
Dingming Liu, Haopu Shang, Wenjing Hong, Chao Qian
DOI: https://doi.org/10.1007/978-3-031-20862-1_40
2022-01-01
Abstract: Multi-label classification is an important topic in machine learning, where each instance can belong to more than one category, i.e., it has a subset of labels instead of only one. Among existing methods, ML-kNN [25], the direct extension of the k-nearest neighbors algorithm to the multi-label scenario, has received much attention due to its conciseness, good interpretability, and good performance. However, ML-kNN usually incurs a high storage cost, since all training instances must be kept in memory. A natural way to address this issue is instance selection, which retains the important instances and deletes the redundant ones. However, previous instance selection methods mainly focus on the single-label scenario and may perform poorly when adapted to the multi-label scenario. Recently, a few works have begun to consider the multi-label scenario, but their performance is limited by inappropriate modeling. In this paper, we propose to formulate the instance selection problem for ML-kNN as a natural bi-objective optimization problem that considers the accuracy and the number of retained instances simultaneously, and we adapt NSGA-II to solve it. Experiments on six real-world data sets show that, compared with state-of-the-art methods, our proposed method achieves prediction accuracy that is no worse together with a significantly better compression ratio.
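To make the bi-objective formulation concrete, the following is a minimal sketch of how one candidate subset of training instances might be scored on both objectives and how candidates could be compared by Pareto dominance, the relation NSGA-II uses to rank solutions. It is an illustration under stated assumptions, not the authors' implementation: the classifier is a simplified per-label majority-vote kNN rather than ML-kNN itself, Hamming loss on a validation set stands in for the accuracy objective (the abstract does not name the metric), and all function names (`knn_predict`, `objectives`, `dominates`, `pareto_front`) are hypothetical.

```python
from typing import List, Sequence, Tuple


def knn_predict(x: Sequence[float], X: Sequence[Sequence[float]],
                Y: Sequence[Sequence[int]], k: int) -> List[int]:
    """Simplified multi-label kNN (assumption: NOT the full ML-kNN):
    each label is predicted by majority vote among the k nearest instances."""
    order = sorted(range(len(X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(x, X[i])))
    nearest = order[:k]
    n_labels = len(Y[0])
    return [1 if 2 * sum(Y[i][j] for i in nearest) > len(nearest) else 0
            for j in range(n_labels)]


def objectives(mask: Sequence[int], X_train, Y_train, X_val, Y_val,
               k: int = 3) -> Tuple[float, int]:
    """Score a binary selection mask; both returned objectives are minimized.

    Returns (Hamming loss on the validation set, number of retained instances).
    """
    kept = [i for i, bit in enumerate(mask) if bit]
    if not kept:
        # Assumption: an empty subset is assigned the worst possible loss.
        return (1.0, 0)
    X = [X_train[i] for i in kept]
    Y = [Y_train[i] for i in kept]
    wrong, total = 0, 0
    for x, y in zip(X_val, Y_val):
        pred = knn_predict(x, X, Y, k)
        wrong += sum(p != t for p, t in zip(pred, y))
        total += len(y)
    return (wrong / total, len(kept))


def dominates(a: Tuple[float, int], b: Tuple[float, int]) -> bool:
    """a dominates b if a is no worse on every objective and better on one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points: Sequence[Tuple[float, int]]) -> List[Tuple[float, int]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

In a full evolutionary search, each individual would be such a binary mask over the training set, and NSGA-II would evolve a population of masks toward the trade-off front between loss and subset size. Returning a front rather than a single solution lets the user pick the compression level that suits the available memory.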