Mix-Up Augmentation for Oracle Character Recognition with Imbalanced Data Distribution

Jing Li, Qiu-Feng Wang, Rui Zhang, Kaizhu Huang
DOI: https://doi.org/10.1007/978-3-030-86549-8_16
2021-01-01
Abstract: Oracle bone characters are probably the oldest hieroglyphs in China. Recognizing them matters because they provide important clues for Chinese archaeology and philology. Automatic oracle bone character recognition, however, remains a challenging problem. In particular, by their very nature, oracle character samples are typically very limited and severely imbalanced in most available datasets, which greatly hinders research on automatic recognition. To alleviate this problem, we propose a mix-up strategy that leverages information from both majority and minority classes to augment the samples of minority classes, so that their decision boundaries are pushed outward toward the majority classes. As a result, the training bias caused by the majority classes is largely reduced. In addition, we train our framework with both the softmax loss and the triplet loss on the augmented samples, which further improves classification accuracy. We conduct extensive evaluations with respect to both total class accuracy and average class accuracy on three benchmark datasets (Oracle-20K, Oracle-AYNU and OBC306). Experimental results show that the proposed method outperforms the compared approaches, attaining a new state of the art in oracle bone character recognition.
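The mix-up idea in the abstract can be illustrated with a minimal sketch. This is the generic mixup formula (a convex combination of two samples and their labels), not the authors' exact procedure: the abstract does not say where the mixing happens or how the mixed sample is labelled. The function name `mixup_minority`, the Beta parameter `alpha`, and the choice to keep the minority sample dominant are all assumptions made for illustration.

```python
import numpy as np

def mixup_minority(x_min, y_min, x_maj, y_maj, num_classes, alpha=1.0, rng=None):
    """Blend one minority-class sample with one majority-class sample.

    Hypothetical helper: generic mixup, not the paper's exact method.
    x_min, x_maj: arrays of the same shape (e.g. grayscale character images).
    y_min, y_maj: integer class indices.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Mixing coefficient drawn from Beta(alpha, alpha), as in standard mixup.
    lam = rng.beta(alpha, alpha)
    # Assumption: force lam >= 0.5 so the minority sample dominates the blend.
    lam = max(lam, 1.0 - lam)
    # Convex combination of the two inputs.
    x_mix = lam * x_min + (1.0 - lam) * x_maj
    # Soft label with the same mixing weights.
    y_mix = np.zeros(num_classes)
    y_mix[y_min] += lam
    y_mix[y_maj] += 1.0 - lam
    return x_mix, y_mix, lam
```

In such a sketch, the blended sample and its soft label would be fed to the classifier alongside the original data; whether the paper mixes raw images or intermediate features, and how the mixed samples enter the triplet loss, is not stated in the abstract.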