Dynamic Perception Framework for Fine-Grained Recognition
Yao Ding, Zhenjun Han, Yanzhao Zhou, Yi Zhu, Jie Chen, Qixiang Ye, Jianbin Jiao
DOI: https://doi.org/10.1109/tcsvt.2021.3069835
IF: 5.859
2022-03-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Fine-grained recognition requires discriminating between categories that differ only in subtle visual details, which are easily overwhelmed by the diverse appearance within each category. Conventional approaches generally locate discriminative parts and then recognize the part-based features. However, we find that tuning the effective receptive field (ERF) of the network to the task plays a key role, because it enables significant regions to contribute more to the output. Inspired by the receptive field stimulation mechanism of the visual cortex, we propose a Dynamic Perception framework as a solution. Our framework adapts the ERF by considering the image space and the kernel space simultaneously. In the image space, a Spatial Selective Sampling module is adopted to enlarge informative regions locally. In the kernel space, Spatial Selective Kernel convolution is introduced to apply different kernel sizes to regions of interest and to backgrounds by embedding spatial attention in a multi-path convolution. Extensive experiments on challenging benchmarks, including CUB-200-2011, FGVC-Aircraft, and Stanford Cars, demonstrate that our method yields a performance boost over state-of-the-art methods.
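The kernel-space idea in the abstract (a multi-path convolution whose paths are mixed by spatial attention) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names box_filter and spatial_select are hypothetical, fixed mean filters stand in for learned convolution paths, and the attention logits are hand-crafted here rather than predicted by a network. The sketch only shows how a per-pixel softmax over paths lets each location favor a different kernel size.

```python
import math

def box_filter(image, k):
    """k x k mean filter with zero padding; a stand-in for one learned conv path."""
    h, w = len(image), len(image[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += image[yy][xx]
            out[y][x] = total / (k * k)
    return out

def spatial_select(paths, logits):
    """Fuse path outputs with a per-pixel softmax over the path logits."""
    h, w = len(paths[0]), len(paths[0][0])
    fused = [[0.0] * w for _ in range(h)]
    weights = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            scores = [l[y][x] for l in logits]
            m = max(scores)  # subtract the max for numerical stability
            exps = [math.exp(s - m) for s in scores]
            z = sum(exps)
            ws = [e / z for e in exps]
            weights[y][x] = ws
            fused[y][x] = sum(wt * p[y][x] for wt, p in zip(ws, paths))
    return fused, weights

# Toy 5x5 input with a single salient pixel at the center.
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 1.0

small = box_filter(image, 1)  # 1x1 path: keeps fine detail
large = box_filter(image, 3)  # 3x3 path: gathers wider context

# Assumed attention logits: favor the small kernel where the input is salient.
logits_small = [[4.0 * v for v in row] for row in image]
logits_large = [[0.0] * 5 for _ in range(5)]

fused, weights = spatial_select([small, large], [logits_small, logits_large])
```

In this toy setup the center pixel puts most of its weight on the small-kernel path, while background pixels split their weight evenly between the two paths. In a trained model the logits would come from a learned spatial attention branch instead of the input itself.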
engineering, electrical & electronic