Image Captioning Based on Adaptive Balancing Loss.

Linghui Li, Sheng Tang, Junbo Guo, Rui Wang, Bo Lyu, Qi Tian, Yongdong Zhang
DOI: https://doi.org/10.1109/bigmm.2018.8499066
2018-01-01
Abstract: Most recent pioneering work on image captioning relies on supervised learning and therefore depends heavily on labeled training data. We observe that these approaches suffer from class imbalance (CIB), which can degrade performance and limit the diversity of the generated sentences. To address this problem, we propose a pipeline for image captioning based on an adaptive balancing loss (ABL), which dynamically re-weights the loss of each category over the course of training. By adaptively reducing the losses of well-classified and frequent categories and increasing the losses of under-classified and infrequent categories, the proposed method can improve accuracy and increase the diversity of the generated descriptions. We evaluate the method on the well-known MS COCO caption dataset. The results show that our approach achieves performance competitive with state-of-the-art methods and can generate more accurate and diverse captions.
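The abstract does not give the ABL formula, so the following is only a minimal sketch of the general idea of re-weighting a per-word cross-entropy loss. It assumes a focal-style difficulty factor and an inverse-frequency factor as stand-ins; the function names, the `gamma` and `beta` parameters, and the weighting form are all hypothetical and are not taken from the paper.

```python
import numpy as np

def balancing_weights(p_true, target_freq, gamma=2.0, beta=0.5):
    """Hypothetical per-token weights (not the paper's ABL formula).

    p_true      : predicted probability of the ground-truth word
    target_freq : relative frequency of that word in the training captions
    """
    difficulty = (1.0 - p_true) ** gamma   # small when the word is well classified
    rarity = target_freq ** (-beta)        # small when the word is frequent
    return difficulty * rarity

def reweighted_cross_entropy(probs, targets, word_freq, gamma=2.0, beta=0.5):
    """Mean cross-entropy with each token scaled by balancing_weights."""
    probs = np.asarray(probs, dtype=float)
    targets = np.asarray(targets)
    word_freq = np.asarray(word_freq, dtype=float)
    # Probability assigned to the ground-truth word at each position.
    p_true = probs[np.arange(len(targets)), targets]
    weights = balancing_weights(p_true, word_freq[targets], gamma, beta)
    token_loss = -np.log(np.clip(p_true, 1e-12, 1.0))
    return float(np.mean(weights * token_loss)), weights
```

Because the difficulty factor is recomputed from the current predictions, the weights change as training progresses, which is one plausible reading of "dynamically" in the abstract. Setting `gamma=0` and `beta=0` recovers the ordinary mean cross-entropy.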