Revisiting the Effective Number Theory for Imbalanced Learning

Ou Wu, Mengyang Li
DOI: https://doi.org/10.1109/tkde.2024.3367949
IF: 9.235
2024-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: Imbalanced learning is a long-standing yet active research subarea in machine learning, and a large number of imbalanced learning methods have been proposed in the literature. This study focuses on one of the most popular imbalanced learning strategies, namely, sample reweighting. The key issue is how to calculate the weights of samples in training. While most studies have relied on intuitive theoretical or heuristic inspirations, few have attempted to establish a comprehensive theoretical path for weight calculation. A recent study utilizes the effective number theory for random covering to construct a theoretical weighting framework. In this study, we conduct a deep analysis to theoretically reveal the defects in the existing effective number-based weighting theory. An enhanced effective number theory is established that accounts for data scatter and covering offset among different categories. Subsequently, a new weight calculation method is proposed based on our new theory, yielding a new loss, namely, the NENum loss. In this loss, weights are sample-wise rather than category-wise as in the existing effective number-based weighting. Furthermore, another novel loss that combines weighting and logit perturbation is designed, inspired by the limitations of the NENum loss. Meta learning is employed to optimize the concrete calculation based on sample-wise training dynamics. We conduct extensive experiments on benchmark imbalanced and standard data corpora. Results validate the reasonableness of our enhanced theory and the effectiveness of the proposed methodology.
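For context, the category-wise weighting that the abstract contrasts against can be sketched as below. This is a minimal sketch under an assumption: that the existing framework is the commonly cited class-balanced formulation, in which a class with n samples has effective number E_n = (1 - β^n) / (1 - β) and is weighted by 1 / E_n. The function name, the β value, and the class counts are illustrative and do not come from the paper. The sketch does not implement the NENum loss, whose sample-wise weights are not specified in the abstract.

```python
import numpy as np

def effective_number_weights(class_counts, beta=0.999):
    """Category-wise weights inversely proportional to the effective number.

    Hypothetical helper for illustration; beta is an assumed hyperparameter in [0, 1).
    """
    counts = np.asarray(class_counts, dtype=np.float64)
    # Effective number under the random-covering model: E_n = (1 - beta^n) / (1 - beta)
    effective_num = (1.0 - np.power(beta, counts)) / (1.0 - beta)
    weights = 1.0 / effective_num
    # Rescale so the weights sum to the number of classes
    return weights * len(counts) / weights.sum()

# Illustrative long-tailed class counts (not from the paper)
weights = effective_number_weights([5000, 500, 50])
print(weights)
```

Every sample in a class receives the same weight here, which is the category-wise behavior that the proposed sample-wise weighting replaces.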
computer science, information systems, artificial intelligence, engineering, electrical & electronic