A Cross-Modal Semantic Alignment and Feature Fusion Method for Bionic Drone and Bird Recognition

Hehao Liu,Dong Li,Ming Zhang,Jun Wan,Shuang Liu,Hanying Zhu,Qinghua Liu
DOI: https://doi.org/10.3390/rs16173121
IF: 5
2024-08-25
Remote Sensing
Abstract:With the continuous progress in drone and materials technology, numerous bionic drones have been developed and employed in various fields. These bionic drones are designed to mimic the shape of birds, seamlessly blending into the natural environment and reducing the likelihood of detection. However, such a high degree of similarity also poses significant challenges in accurately distinguishing between real birds and bionic drones. Existing methods attempt to recognize both using optical images, but the visual similarity often results in poor recognition accuracy. To alleviate this problem, in this paper, we propose a cross-modal semantic alignment and feature fusion (CSAFF) network to improve the recognition accuracy of bionic drones. CSAFF aims to introduce motion behavior information as an auxiliary cue to improve discriminability. Specifically, a semantic alignment module (SAM) was designed to explore the consistent semantic information between cross-modal data and provide more semantic cues for the recognition of bionic drones and birds. Then, a feature fusion module (FFM) was developed to fully integrate cross-modal information, which effectively enhances the representability of these features. Extensive experiments were performed on datasets containing bionic drones and birds, and the experimental results consistently show the effectiveness of the proposed CSAFF method in identifying bionic drones and bionic birds.
environmental sciences,imaging science & photographic technology,remote sensing,geosciences, multidisciplinary
What problem does this paper attempt to address?