Hierarchical Self-Distilled Feature Learning for Fine-Grained Visual Categorization
Yutao Hu,Xiaolong Jiang,Xuhui Liu,Xiaoyan Luo,Yao Hu,Xianbin Cao,Baochang Zhang,Jun Zhang
DOI: https://doi.org/10.1109/tnnls.2021.3124135
IF: 14.255
2021-01-01
IEEE Transactions on Neural Networks and Learning Systems
Abstract:Fine-grained visual categorization (FGVC) relies on hierarchical features extracted by deep convolutional neural networks (CNNs) to recognize closely alike objects. Particularly, shallow layer features containing rich spatial details are vital for specifying subtle differences between objects but are usually inadequately optimized due to gradient vanishing during backpropagation. In this article, hierarchical self-distillation (HSD) is introduced to generate well-optimized CNNs features for accurate fine-grained categorization. HSD inherits from the widely applied deep supervision and implements multiple intermediate losses for reinforced gradients. Besides that, we observe that the hard (one-hot) labels adopted for intermediate supervision hurt the performance of FGVC by enforcing overstrict supervision. As a solution, HSD seeks self-distillation where soft predictions generated by deeper layers of the network are hierarchically exploited to supervise shallow parts. Moreover, self-information entropy loss (SIELoss) is designed in HSD to adaptively soften intermediate predictions and facilitate better convergence. In addition, the gradient detached fusion (GDF) module is incorporated to produce an ensemble result with multiscale features via effective feature fusion. Extensive experiments on four challenging fine-grained datasets show that, with neglectable parameter increase, the proposed HSD framework and the GDF module both bring significant performance gains over different backbones, which also achieves state-of-the-art classification performance.
computer science, artificial intelligence, theory & methods,engineering, electrical & electronic, hardware & architecture