Value-Aware Resampling and Loss for Imbalanced Classification

Li Sun, Jie Song, Cheng Hua, Chengchao Shen, Mingli Song
DOI: https://doi.org/10.1145/3207677.3278084
2018-01-01
Abstract: Existing machine learning methods usually treat training samples equally, and their performance degrades significantly when facing imbalanced training data. This paper introduces Value-Aware Resampling and Loss (VARL) to tackle the imbalanced classification problem, where high-value samples play a more important role than low-value samples in the model training process. Specifically, the training value of each training sample is assessed according to its predicted probability of the ground-truth label; the training samples are then resampled to produce a balanced training set; finally, model training is further boosted by an instance-level value-aware loss function. To conduct fair comparisons among different methods, we compile 13 datasets for imbalanced classification. Experiments demonstrate that our proposed method can effectively measure the training value of training samples and achieves superior performance in imbalanced classification compared with several existing methods.
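The three steps named in the abstract (value assessment, resampling, value-aware loss) can be illustrated with a minimal NumPy sketch. The abstract does not give the formulas, so everything specific here is an assumption made for illustration and not the authors' method: the value is taken to be one minus the predicted probability of the ground-truth label, resampling draws equally many samples per class with probability proportional to value, and the loss is a value-weighted cross-entropy. The function names `training_value`, `value_aware_resample`, and `value_aware_loss` are hypothetical.

```python
import numpy as np


def training_value(probs, labels):
    """Assumed value: 1 - predicted probability of the ground-truth label."""
    p_true = probs[np.arange(len(labels)), labels]
    return 1.0 - p_true


def value_aware_resample(values, labels, rng):
    """Draw equally many indices per class, favouring high-value samples.

    Assumption: every class is sampled up to the size of the largest class.
    """
    classes, counts = np.unique(labels, return_counts=True)
    per_class = counts.max()
    chosen = []
    for c in classes:
        idx = np.flatnonzero(labels == c)
        w = values[idx] + 1e-8  # avoid an all-zero weight vector
        chosen.append(rng.choice(idx, size=per_class, replace=True, p=w / w.sum()))
    return np.concatenate(chosen)


def value_aware_loss(probs, labels, values):
    """Cross-entropy in which each sample's term is weighted by its normalised value."""
    p_true = probs[np.arange(len(labels)), labels]
    weights = values / (values.sum() + 1e-8)
    return float(-(weights * np.log(p_true + 1e-12)).sum())


# Toy example: 4 majority-class samples (label 0), 2 minority-class samples (label 1).
rng = np.random.default_rng(0)
probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.6, 0.4],
                  [0.7, 0.3], [0.3, 0.7], [0.6, 0.4]])
labels = np.array([0, 0, 0, 0, 1, 1])

values = training_value(probs, labels)
idx = value_aware_resample(values, labels, rng)
loss = value_aware_loss(probs[idx], labels[idx], values[idx])
```

In this sketch a sample the model already predicts confidently contributes little to both the resampled set and the loss, which is one plausible reading of "high-value samples play a more important role"; the paper itself should be consulted for the actual definitions.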