To Balance or Not to Balance: A Simple-yet-Effective Approach for Learning with Long-Tailed Distributions

Junjie Zhang, Lingqiao Liu, Peng Wang, Chunhua Shen
2019-01-01
Abstract: Real-world visual data often exhibits a long-tailed distribution, where some "head" classes have a large number of samples, yet only a few samples are available for "tail" classes. Such imbalanced distribution causes a great challenge for learning a deep neural network, which can be boiled down into a dilemma: on the one hand, we prefer to increase the exposure of tail class samples to avoid the excessive dominance of head classes in the classifier training. On the other hand, oversampling tail classes makes the network prone to over-fitting, since head class samples are often consequently under-represented. To resolve this dilemma, in this paper, we propose a simple-yet-effective auxiliary learning approach. The key idea is to split a network into a classifier part and a feature extractor part, and then employ different training strategies for each part. Specifically, to promote the awareness of tail-classes, a class-balanced sampling scheme is utilised for training both the classifier and the feature extractor. For the feature extractor, we also introduce an auxiliary training task, which is to train a classifier under the regular random sampling scheme. In this way, the feature extractor is jointly trained from both sampling strategies and thus can take advantage of all training data and avoid the over-fitting issue. Apart from this basic auxiliary task, we further explore the benefit of using self-supervised learning as the auxiliary task. Without using any bells and whistles, our model achieves superior performance over the state-of-the-art solutions.
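The two sampling schemes named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy class sizes, and the "pick a class uniformly, then a sample within it" form of class-balanced sampling are assumptions chosen for illustration. In the approach described above, batches from the class-balanced sampler would drive the main classifier and the feature extractor, while batches from the regular random sampler would drive the auxiliary classifier that shares the same feature extractor.

```python
import random
from collections import Counter, defaultdict


def make_long_tailed_labels(class_sizes):
    """Return a flat label list, one entry per sample (hypothetical helper)."""
    return [c for c, n in enumerate(class_sizes) for _ in range(n)]


def random_batch(labels, batch_size, rng):
    """Regular random sampling: every sample index is equally likely."""
    return [rng.randrange(len(labels)) for _ in range(batch_size)]


def class_balanced_batch(labels, batch_size, rng):
    """Class-balanced sampling (assumed form): draw a class uniformly,
    then draw one sample index uniformly from that class."""
    by_class = defaultdict(list)
    for idx, c in enumerate(labels):
        by_class[c].append(idx)
    classes = sorted(by_class)
    return [rng.choice(by_class[rng.choice(classes)]) for _ in range(batch_size)]


rng = random.Random(0)
# Toy long-tailed dataset: one head class, one medium class, one tail class.
labels = make_long_tailed_labels([1000, 100, 10])

# Class frequencies seen by each sampler over 3000 draws.
rand_counts = Counter(labels[i] for i in random_batch(labels, 3000, rng))
bal_counts = Counter(labels[i] for i in class_balanced_batch(labels, 3000, rng))

print("random sampling:        ", dict(sorted(rand_counts.items())))
print("class-balanced sampling:", dict(sorted(bal_counts.items())))
```

Under random sampling the head class dominates the draws, whereas under class-balanced sampling each class appears roughly equally often, which means the ten tail samples are repeated many times. That repetition is the over-fitting risk the abstract points to, and it is why the feature extractor is also trained through the randomly sampled auxiliary task.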