Hybrid Convolutional-Transformer framework for drone-based few-shot weakly supervised object detection

Shengming Li, Linsong Xue, Lin Feng, Cuili Yao, Dong Wang
DOI: https://doi.org/10.1016/j.compeleceng.2022.108154
2022-09-01
Abstract: Drone delivery is becoming a new trend in logistics systems, but little research has been done in this field. Locating target buildings in the drone camera view is a crucial technique. However, it is difficult to collect extensive drone-view images and their bounding box annotations for supervised training. Therefore, we address this problem by formulating it as a weakly supervised task and using a small number of category labels as supervision. To extract representative features of cross-view and cross-device images, we propose a Hybrid Convolutional-Transformer (HCT) framework for detection given very few images with image-level annotations. To better evaluate the proposed method in the realistic drone delivery task, we build a drone-view object detection dataset based on the University-1652 benchmark by annotating bounding boxes of target buildings. Extensive experimental results demonstrate the effectiveness of the proposed method.
engineering, electrical & electronic; computer science, interdisciplinary applications, hardware & architecture