Leveraging Attention-Based Visual Clue Extraction for Image Classification

Yunbo Cui, Youtian Du, Xue Wang, Hang Wang, Chang Su
DOI: https://doi.org/10.1049/ipr2.12280
IF: 2.3
2021-01-01
IET Image Processing
Abstract: Deep learning-based approaches have made considerable progress in image classification, but most of them lack interpretability, especially in revealing the decisive information that causes an image to be assigned to a category. This paper asks which clues encode the discriminative visual information between image categories and can help improve classification performance. To this end, an attention-based clue extraction network (ACENet) is introduced to mine the decisive local visual information for image classification. ACENet constructs a clue-attention mechanism, that is, global-local attention, between the image and the visual clue proposals extracted from it, and then introduces a contrastive loss defined over the resulting discrete attention distribution to increase the discriminability of the clue proposals. The loss encourages considerable attention to be devoted to discriminative clue proposals, that is, those that are similar within the same category and dissimilar across categories. Experimental results on the Negative Web Image (NWI) dataset and the public ImageNet2012 dataset demonstrate that ACENet can extract true clues to improve image classification performance and that it outperforms the baselines and state-of-the-art methods.
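The global-local attention described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the scoring function, feature sizes, or how proposals are generated, so everything here is an assumption made for illustration: the function names clue_attention and attended_clue are hypothetical, scaled dot-product scoring is swapped in as a common choice, and random vectors stand in for real image and proposal features. The sketch covers only the attention step and is not ACENet's actual implementation.

```python
import numpy as np


def clue_attention(global_feat, proposal_feats):
    """Discrete attention distribution of one global image feature over K proposals.

    Assumption: scores are scaled dot products; the paper may use a different form.
    """
    scores = proposal_feats @ global_feat / np.sqrt(global_feat.shape[0])
    scores = scores - scores.max()  # shift for numerical stability before exponentiating
    weights = np.exp(scores)
    return weights / weights.sum()  # softmax: non-negative, sums to 1


def attended_clue(global_feat, proposal_feats):
    """Return the attention weights and the attention-weighted sum of proposal features."""
    weights = clue_attention(global_feat, proposal_feats)
    return weights, weights @ proposal_feats


# Hypothetical stand-ins: one 8-dimensional global feature and 5 proposal features.
rng = np.random.default_rng(0)
g = rng.normal(size=8)
P = rng.normal(size=(5, 8))
w, clue = attended_clue(g, P)

# A proposal aligned with the global feature should receive the most attention.
aligned = np.stack([g, -g, np.zeros_like(g)])
w_aligned = clue_attention(g, aligned)
```

In this sketch, w is the discrete attention distribution over proposals; a contrastive loss of the kind the abstract mentions would be defined on top of such a distribution.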