Weakly supervised fine-grained recognition based on spatial-channel aware attention filters
Nannan Yu, Lei Huang, Zhiqiang Wei, Wenfeng Zhang, Bin Wang
DOI: https://doi.org/10.1007/s11042-020-10268-y
IF: 2.577
2021-01-01
Multimedia Tools and Applications
Abstract: Fine-grained recognition is very challenging, since it is difficult to mine discriminative and subtle features for objects with similar visual appearance. Because massive manual annotations (e.g., bounding boxes for discriminative regions) are time-consuming and labor-intensive, existing methods use a single form of attention model to output discriminative regions in a weakly supervised way. In this paper, we propose a novel method, Spatial-Channel Aware Attention Filters (SCAF), to address the weakly supervised fine-grained recognition problem. Compared with previous attention models, SCAF obtains attention-aware features along two dimensions, i.e., the spatial locations of the image and the channels of the feature maps. With the proposed SCAF, the model can enhance the discriminative regions on both the spatial and channel dimensions simultaneously. In addition, a multi-channel network and a multi-level structure are designed to extract richer regional features. Moreover, focal loss is introduced to balance the sample distribution of fine-grained image datasets. Comprehensive comparative experiments are conducted on publicly available datasets, and the experimental results show that our method achieves state-of-the-art performance on fine-grained recognition tasks. For instance, we achieve 99.370% and 80.749% accuracy on two underwater datasets, Fish4Knowledge and Wild Fish, respectively.
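As an illustration of the focal loss mentioned in the abstract, the following is a minimal sketch of the standard formulation, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), for a single sample. It is not the authors' implementation: the function name `focal_loss` and the defaults `gamma=2.0` and `alpha=1.0` are assumptions for illustration, since the abstract does not state the settings used in the paper.

```python
import math


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample (hypothetical helper, not from the paper).

    probs  -- predicted class probabilities (assumed to sum to 1)
    target -- index of the true class
    gamma  -- focusing parameter (assumed default); gamma=0 gives cross-entropy
    alpha  -- weighting factor (assumed default)
    """
    # Clamp the true-class probability to avoid log(0).
    p_t = min(max(probs[target], 1e-12), 1.0)
    # The (1 - p_t)^gamma factor shrinks the loss of well-classified samples.
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    easy = focal_loss([0.9, 0.1], target=0)  # confident and correct
    hard = focal_loss([0.9, 0.1], target=1)  # confident and wrong
    print(easy, hard)
```

With `gamma=0` and `alpha=1` the expression reduces to ordinary cross-entropy; larger `gamma` values shift the training signal toward hard, often under-represented, samples.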