ECC++: An algorithm family based on ensemble of classifier chains for classifying imbalanced multi-label data

Jicong Duan, Yan Gu, Hualong Yu, Xibei Yang, Shang Gao
DOI: https://doi.org/10.1016/j.eswa.2023.121366
IF: 8.5
2023-09-06
Expert Systems with Applications
Abstract: Multi-label learning has attracted a great deal of research interest because it has a wide range of real-world applications. Although many multi-label learning methods have been proposed, very few of them address the problem of imbalanced class distribution in multi-label data. Moreover, most existing class imbalance multi-label learning algorithms focus only on the class imbalance problem and do not take the correlations among labels into account. To address both issues simultaneously, we propose combining the well-known ensemble of classifier chains (ECC) algorithm with various binary-class imbalance learning techniques, such as sampling, cost-sensitive learning, and threshold moving. This approach creates a new algorithm family called ECC++, designed specifically for class imbalance multi-label learning. ECC is already an excellent high-order ensemble multi-label learning algorithm based on binary relevance, and it is well suited to exploiting correlations among labels. Combining it with binary-class imbalance learning techniques enables each link in a classifier chain (CC) to overcome the negative effect of a skewed data distribution. ECC++ is a dynamic algorithm family that can be extended arbitrarily by plugging in any new binary-class imbalance learning technique. To demonstrate the effectiveness and superiority of the proposed ECC++ algorithm family, we developed several ECC++ family members using some popular binary-class imbalance learning techniques and compared them with several state-of-the-art class imbalance multi-label learning algorithms on twelve benchmark and four real-world multi-label datasets. The experimental results showed the effectiveness and superiority of the proposed ECC++ algorithm family over existing class imbalance multi-label learning algorithms.
In conclusion, the proposed ECC++ algorithm family combines the strengths of the well-established ECC algorithm and binary-class imbalance learning techniques, resulting in a superior methodology for class imbalance multi-label learning.
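The core idea in the abstract, attaching a binary-class imbalance technique to every link of every chain in an ensemble, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes threshold moving as the imbalance technique, with each link's threshold set to the clamped positive rate of its training labels (one common heuristic), a plain gradient-descent logistic regression as the base learner, and a toy dataset. All function names (`fit_logreg`, `fit_chain`, `fit_ecc`, and so on) are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Clip the input so math.exp cannot overflow.
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

def fit_logreg(X, y, epochs=200, lr=0.5):
    """Batch gradient-descent logistic regression; the last weight is the bias."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(a * b for a, b in zip(w, xi)) + w[-1]) - yi
            for j, a in enumerate(xi):
                grad[j] += err * a
            grad[-1] += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def proba(w, x):
    return sigmoid(sum(a * b for a, b in zip(w, x)) + w[-1])

def fit_chain(X, Y, order):
    """Fit one classifier chain; every link gets its own moved threshold."""
    links = []
    for pos, label in enumerate(order):
        # Augment the features with the true values of labels earlier in the chain.
        Xa = [x + [row[l] for l in order[:pos]] for x, row in zip(X, Y)]
        y = [row[label] for row in Y]
        # Threshold moving (assumed heuristic): positive rate, clamped to [0.05, 0.5].
        threshold = min(0.5, max(0.05, sum(y) / len(y)))
        links.append((label, fit_logreg(Xa, y), threshold))
    return links

def predict_chain(links, x, n_labels):
    out, feats = [0] * n_labels, list(x)
    for label, w, threshold in links:
        out[label] = int(proba(w, feats) >= threshold)
        feats.append(out[label])  # pass the prediction down the chain
    return out

def fit_ecc(X, Y, n_chains=5, seed=0):
    """Ensemble of chains: each uses a bootstrap sample and a random label order."""
    rng = random.Random(seed)
    n, q = len(X), len(Y[0])
    chains = []
    for _ in range(n_chains):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        order = rng.sample(range(q), q)             # random label order
        chains.append(fit_chain([X[i] for i in idx], [Y[i] for i in idx], order))
    return chains

def predict_ecc(chains, x, n_labels):
    # Majority vote over the chains for each label.
    votes = [0] * n_labels
    for links in chains:
        for j, v in enumerate(predict_chain(links, x, n_labels)):
            votes[j] += v
    return [int(2 * v >= len(chains)) for v in votes]

# Toy data (made up for illustration): label 0 is the minority label.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.9, 1.0], [1.0, 0.8]]
Y = [[0, 1], [0, 1], [0, 0], [0, 1], [1, 1], [1, 0]]
chains = fit_ecc(X, Y)
pred = predict_ecc(chains, [0.95, 0.9], n_labels=2)
```

Swapping the threshold rule in `fit_chain` for resampling or class-weighted training would yield a different family member under the same ensemble structure, which is the sense in which the abstract describes the family as extensible.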
Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic; Operations Research & Management Science