Context-Aware Content Interaction: Grasp Subtle Clues for Fine-Grained Aircraft Detection
Xueru Xu, Zhong Chen, Xiaolei Zhang, Guoyou Wang
DOI: https://doi.org/10.1109/tgrs.2024.3464851
IF: 8.2
2024-10-08
IEEE Transactions on Geoscience and Remote Sensing
Abstract: Fine-grained object detection, which identifies subcategories or types, is thriving in remote sensing scenes. In practice, most existing fine-grained detectors are derived from the two-stage R-CNN paradigm with intricate anchor boxes and focus on refining region of interest (RoI) features to boost performance, which often incurs redundant processing. In contrast, the one-stage, anchor-free paradigm offers a simple yet effective pipeline, but its use in fine-grained detection remains underexplored. In this article, we propose a one-stage, anchor-free fine-grained detector for remote sensing aircraft recognition. We first examine the predominant issues that arise when extending the one-stage framework to fine-grained detection, typified by severe interclass confusion and inferior performance on rare categories. We then design a fine-grained classification branch, comprising a region-to-region context distributor (R2CD), a class-aware decoupled focal loss (CDFL), and a cross-shaped sample space (CS3), to address these issues. Specifically, the R2CD flexibly integrates the sparse attention mechanism with mask prediction operations to conduct region-level content interactions separately within the foreground and background of feature maps, significantly alleviating interclass confusion by enhancing subtle features; the CDFL employs dynamic modulation factors driven by optimization gradients to regulate loss contributions across categories while optimizing category-specific heatmaps, thus prioritizing rare categories with hard samples; the CS3 attains a preferable assignment strategy for positive and negative samples by incorporating a structural prior, facilitating the capture of foreground features. Extensive experiments on the MAR20 and FAIRPlane11 datasets demonstrate that our model excels at distinguishing fine-grained categories and is well suited to fine-grained detection tasks.
imaging science & photographic technology; remote sensing; engineering, electrical & electronic; geochemistry & geophysics
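
The abstract describes the CDFL as a focal loss over category-specific heatmaps whose per-category contributions are regulated by modulation factors. The sketch below illustrates only that general idea, not the paper's formulation: it applies the standard binary focal loss to each class heatmap and scales each class by a fixed weight. The function name `class_weighted_focal_loss`, the `class_weights` argument, and the use of static weights in place of the gradient-driven dynamic factors are all assumptions made for illustration.

```python
import numpy as np


def class_weighted_focal_loss(pred, target, class_weights, gamma=2.0, eps=1e-6):
    """Illustrative per-class focal loss on heatmaps (hypothetical helper).

    pred:          (C, H, W) predicted probabilities in (0, 1)
    target:        (C, H, W) binary ground-truth heatmaps (1 = positive)
    class_weights: (C,) per-class scaling; fixed here, an assumption that
                   stands in for the dynamic factors the abstract mentions
    """
    pred = np.clip(pred, eps, 1.0 - eps)
    pos = target == 1

    # Standard focal terms: easy examples are down-weighted by gamma.
    pos_loss = -((1.0 - pred) ** gamma) * np.log(pred) * pos
    neg_loss = -(pred ** gamma) * np.log(1.0 - pred) * (~pos)

    # Sum over spatial positions to get one loss value per class.
    per_class = (pos_loss + neg_loss).sum(axis=(1, 2))

    # Normalize by the number of positive locations (at least 1).
    num_pos = max(int(pos.sum()), 1)
    return float((np.asarray(class_weights) * per_class).sum() / num_pos)
```

Under these assumptions, raising the weight of a rare class increases the penalty for a missed positive in that class's heatmap, which is the qualitative effect the abstract attributes to prioritizing rare categories.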