Learning from Imbalanced and Incomplete Supervision with Its Application to Ride-Sharing Liability Judgment
Lan-Zhe Guo, Zhi Zhou, Jie-Jing Shao, Qi Zhang, Feng Kuang, Gao-Le Li, Zhang-Xun Liu, Guo-Bin Wu, Nan Ma, Qun Li, Yu-Feng Li
DOI: https://doi.org/10.1145/3447548.3467305
2021-01-01
Abstract: In multi-label tasks, sufficient and class-balanced labels are usually hard to obtain, which makes it challenging to train a good classifier. In this paper, we consider the problem of learning from imbalanced and incomplete supervision, where only a small subset of the data is labeled and the label distribution is highly imbalanced. This setting is important and appears in a variety of real applications. For instance, in the ride-sharing liability judgment task, liability disputes arise for a variety of reasons; however, manually annotating the reasons is expensive, and the distribution of reasons is often seriously imbalanced. We present a systematic framework, Limi, consisting of three sub-steps: Label Separating, Correlation Mining, and Label Completion. Specifically, we propose an effective two-classifier strategy that tackles head and tail labels separately, so as to alleviate the performance degradation on tail labels while maintaining high performance on head labels. Then, a novel label correlation network is adopted to explore label relation knowledge with flexible aggregators. Moreover, the Limi framework completes the labels of unlabeled instances in a semi-supervised fashion. The framework is general, flexible, and effective. Extensive experiments on diverse applications, including the ride-sharing liability judgment task from Didi and various benchmark tasks, demonstrate that our solution outperforms many competitive methods.
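To make the Label Separating idea concrete, the following is a minimal sketch of one way to partition labels into head (frequent) and tail (rare) groups by their frequency in the labeled data, so that each group could be handled by its own classifier. The abstract does not state the criterion Limi actually uses; the function name `split_head_tail`, the `head_fraction` parameter, and the toy label matrix are all assumptions made for illustration.

```python
def split_head_tail(label_matrix, head_fraction=0.2):
    """Partition label indices into head and tail groups by frequency.

    label_matrix: list of rows, each a list of 0/1 label indicators.
    head_fraction: hypothetical hyperparameter, the share of labels
                   (ranked by frequency) treated as head labels.
    Returns (head_indices, tail_indices, counts).
    """
    n_labels = len(label_matrix[0])
    # Count how many labeled instances carry each label.
    counts = [sum(row[j] for row in label_matrix) for j in range(n_labels)]
    # Rank labels from most to least frequent (sorted is stable on ties).
    order = sorted(range(n_labels), key=lambda j: -counts[j])
    # Keep at least one head label.
    n_head = max(1, round(head_fraction * n_labels))
    head = sorted(order[:n_head])
    tail = sorted(order[n_head:])
    return head, tail, counts


# Toy data (illustrative only): 5 instances, 4 labels, label 0 dominates.
Y = [
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
]
head, tail, counts = split_head_tail(Y, head_fraction=0.25)
print("counts:", counts)
print("head labels:", head)
print("tail labels:", tail)
```

In a two-classifier setup of this kind, one model would be trained on the columns in `head` and another on the columns in `tail`, which keeps the dominant labels from swamping the rare ones during training.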