HCP-MIC at VQA-Med 2020: Effective Visual Representation for Medical Visual Question Answering.

Guanqi Chen, Haifan Gong, Guanbin Li
2020-01-01
Abstract: This paper describes our submission to the Medical Domain Visual Question Answering Task of ImageCLEF 2020. We forgo complex cross-modal fusion strategies and concentrate on capturing an effective visual representation, because of the information inequality between images and questions in this task. Having observed a long-tailed distribution in the training set, we adopt a bilateral-branch network with a cumulative learning strategy to address it. In addition, to alleviate the problem of limited training data, we design an approach that extends the training set using Kullback-Leibler divergence. Our method achieved an accuracy of 0.426 and a BLEU score of 0.462, ranking 4th in the competition. Our code is publicly available.
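The cumulative learning strategy mentioned in the abstract can be illustrated with a minimal sketch. It assumes the parabolic-decay schedule from the original bilateral-branch network formulation, alpha = 1 - (T / T_max)^2, where alpha weights the conventional branch and 1 - alpha weights the re-balancing branch. The abstract does not state which schedule this submission uses, so the schedule and the function names `cumulative_alpha` and `mix` are illustrative assumptions, not the authors' code.

```python
def cumulative_alpha(epoch: int, total_epochs: int) -> float:
    """Weight of the conventional branch at a given epoch.

    Assumed schedule (original BBN formulation): alpha = 1 - (T / T_max)^2.
    It starts at 1.0 and decays to 0.0, so training shifts gradually
    from the conventional branch to the re-balancing branch.
    """
    if total_epochs <= 0:
        raise ValueError("total_epochs must be positive")
    ratio = epoch / total_epochs
    return 1.0 - ratio ** 2


def mix(conventional: float, rebalanced: float, alpha: float) -> float:
    """Weighted combination of the two branches' terms (hypothetical helper).

    The same weighting can be applied to branch outputs or to the
    losses computed against each branch's sampled label.
    """
    return alpha * conventional + (1.0 - alpha) * rebalanced


if __name__ == "__main__":
    total = 10
    for epoch in (0, 5, 10):
        a = cumulative_alpha(epoch, total)
        print(epoch, a, mix(2.0, 4.0, a))
```

Early in training the combined term is dominated by the conventional branch, which sees the long-tailed data as it is; late in training the re-balancing branch, which favours tail classes, takes over.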