Dual-Stream Feature Aggregation Network for Unmanned Aerial Vehicle Aerial Images Semantic Segmentation

Li Runzeng, Shi Zaifeng, Kong Fanning, Zhao Xiangyang, Luo Tao
DOI: https://doi.org/10.3788/LOP230955
2023-01-01
Laser & Optoelectronics Progress
Abstract: Objects in unmanned aerial vehicle (UAV) aerial images differ greatly in size, which makes it difficult to balance segmentation quality across objects of different sizes within the receptive field. To address this problem, a dual-stream feature aggregation network (DSFA-Net) is proposed, which uses two branches to extract low-level and high-level features separately. In the encoder, a low-level information extraction branch with three serial ConvNeXt modules preserves more low-level features by generating feature maps with more channels. In the deep feature branch, the coordinate attention atrous spatial pyramid pooling (CA-ASPP) module reassigns weights to feature maps in the channel dimension, so that the module focuses on segmentation objects of different sizes and obtains deep-level multi-scale features. During decoding, the bilateral guided aggregation module performs resolution aggregation between the low-level and deep-level features. The method is evaluated on the AeroScapes and Semantic Drone datasets, where it achieves a mean intersection over union of 83.16% and 72.09%, respectively, and a mean pixel accuracy of 90.75% and 80.34%, respectively. Compared with mainstream methods, the proposed method is better at segmenting objects that differ greatly in size, and it is suitable for semantic segmentation of UAV aerial images.
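The two reported metrics, mean intersection over union and mean pixel accuracy, can both be derived from a per-class confusion matrix. The abstract does not state the exact evaluation protocol, so the following is only a minimal sketch that assumes the standard definitions: per-class IoU is TP / (TP + FP + FN), per-class pixel accuracy is TP / (TP + FN), and each is averaged over classes. The function names and the choice to skip classes that never occur are illustrative assumptions, not taken from the paper.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Average TP / (TP + FP + FN) over classes.

    Assumption: classes absent from both labels and predictions are skipped.
    """
    ious = []
    for c in range(len(cm)):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                   # ground-truth pixels of class c that were missed
        fp = sum(row[c] for row in cm) - tp    # pixels wrongly predicted as class c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


def mean_pixel_accuracy(cm):
    """Average TP / (TP + FN) over classes.

    Assumption: classes with no ground-truth pixels are skipped.
    """
    accs = []
    for c in range(len(cm)):
        total = sum(cm[c])
        if total > 0:
            accs.append(cm[c][c] / total)
    return sum(accs) / len(accs) if accs else 0.0


# Toy example: six pixels flattened into 1-D label lists, three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
miou = mean_iou(cm)
mpa = mean_pixel_accuracy(cm)
```

For the toy labels above, the per-class IoUs are 1/3, 2/3, and 1/2, so `miou` is 0.5, and the per-class accuracies are 1/2, 1, and 1/2, so `mpa` is 2/3. Because both metrics weight every class equally, small object classes count as much as large ones.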