Semantic Segmentation of Unmanned Aerial Vehicle Remote Sensing Images using SegFormer

Vlatko Spasev, Ivica Dimitrovski, Ivan Chorbev, Ivan Kitanovski
2024-10-02
Abstract: The escalating use of Unmanned Aerial Vehicles (UAVs) as remote sensing platforms has garnered considerable attention, proving invaluable for ground object recognition. While satellite remote sensing images face limitations in resolution and weather susceptibility, UAV remote sensing, employing low-speed unmanned aircraft, offers enhanced object resolution and agility. The advent of advanced machine learning techniques has propelled significant strides in image analysis, particularly in semantic segmentation for UAV remote sensing images. This paper evaluates the effectiveness and efficiency of SegFormer, a semantic segmentation framework, for the semantic segmentation of UAV images. SegFormer variants, ranging from real-time (B0) to high-performance (B5) models, are assessed using the UAVid dataset tailored for semantic segmentation tasks. The research details the architecture and training procedures specific to SegFormer in the context of UAV semantic segmentation. Experimental results showcase the model's performance on the benchmark dataset, highlighting its ability to accurately delineate objects and land cover features in diverse UAV scenarios, leading to both high efficiency and performance.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses semantic segmentation of unmanned aerial vehicle (UAV) remote sensing images. Specifically, it evaluates the effectiveness and efficiency of the SegFormer framework on this task.

### Problem Background

1. **Limitations of Satellite Remote Sensing Images**:
   - Satellite remote sensing images are limited in resolution and sensitive to weather, which makes ground object recognition difficult.
2. **Advantages of UAV Remote Sensing**:
   - UAV remote sensing acquires image data from low-speed unmanned aircraft and offers higher object resolution and flexibility. Because UAVs fly at lower altitudes, they can capture aerial images at centimeter-level resolution.
3. **Requirement for Semantic Segmentation**:
   - Semantic segmentation classifies every pixel in the image to produce a detailed segmentation map, which is crucial for accurately analyzing ground objects and their relationships. Traditional machine learning methods perform poorly on complex images, so more advanced deep learning techniques are required.

### Research Objectives

The specific objectives of the paper are:

1. **Evaluating the Effectiveness of the SegFormer Model**:
   - Use the UAVid dataset to evaluate the semantic segmentation performance of the SegFormer variants, from B0 (real-time applications) to B5 (high-performance applications).
2. **Measuring the Efficiency of the Model**:
   - Compare the SegFormer variants on number of parameters, frames per second (FPS), and latency.
3. **Exploring Improvement Methods**:
   - Examine the impact of test-time augmentation and ensemble methods on model performance.
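The efficiency indicators named in objective 2 can be illustrated with a minimal sketch. It is not the paper's benchmarking code: `count_parameters` and `measure_latency_and_fps` are hypothetical helper names, the weight shapes are made up, and `infer` stands in for one forward pass of any model. FPS is taken here simply as the reciprocal of the mean per-image latency, which is an assumption about how the two are related.

```python
import math
import time
from statistics import mean
from typing import Callable, Iterable, Sequence, Tuple


def count_parameters(shapes: Iterable[Sequence[int]]) -> int:
    """Total number of scalar weights, given the shape of each weight tensor."""
    return sum(math.prod(shape) for shape in shapes)


def measure_latency_and_fps(
    infer: Callable[[], object], warmup: int = 3, runs: int = 20
) -> Tuple[float, float]:
    """Return (mean latency in seconds, frames per second) for one-image inference.

    `infer` is a zero-argument callable that performs a single forward pass.
    Warm-up calls are excluded from timing so one-off setup costs do not
    distort the average.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    for _ in range(warmup):
        infer()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        timings.append(time.perf_counter() - start)
    latency = mean(timings)
    # Assumed relation: one image per call, so FPS = 1 / latency.
    fps = float("inf") if latency == 0 else 1.0 / latency
    return latency, fps


if __name__ == "__main__":
    # Made-up shapes: one 3x3 convolution (3 -> 32 channels) plus its bias.
    print("parameters:", count_parameters([(3, 3, 3, 32), (32,)]))
    # Dummy workload standing in for a model's forward pass.
    latency, fps = measure_latency_and_fps(lambda: sum(range(20000)))
    print(f"latency: {latency * 1e3:.3f} ms, FPS: {fps:.1f}")
```

When timing a GPU model, the device usually has to be synchronized before each clock reading, otherwise asynchronous kernel launches make the latency look smaller than it is.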
### Main Contributions

- Verified the strong performance of SegFormer on UAV image semantic segmentation, especially in urban scenes.
- Provided detailed experimental results showing how the different SegFormer variants perform across multiple evaluation metrics.
- Discussed how ensemble methods and test-time augmentation can further improve segmentation accuracy.

In summary, the paper introduces and evaluates the SegFormer model for semantic segmentation of UAV remote sensing images, with the aim of advancing the field.
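The ensemble and test-time augmentation ideas mentioned above can be illustrated with a small sketch in plain Python. The specifics are assumptions, not the paper's setup: the only augmentation used is a horizontal flip, predictions are combined by averaging per-pixel class probabilities, and `toy_model` is a hypothetical stand-in for a trained segmentation network.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]          # image[y][x] = pixel brightness in [0, 1]
ProbMap = List[List[List[float]]]  # probs[c][y][x] = probability of class c
Model = Callable[[Image], ProbMap]


def hflip_image(image: Image) -> Image:
    """Mirror an image left to right."""
    return [row[::-1] for row in image]


def hflip_probs(probs: ProbMap) -> ProbMap:
    """Mirror every class plane left to right (undoes the image flip)."""
    return [[row[::-1] for row in plane] for plane in probs]


def average_maps(maps: Sequence[ProbMap]) -> ProbMap:
    """Element-wise mean of several probability maps of identical shape."""
    n = len(maps)
    classes, height, width = len(maps[0]), len(maps[0][0]), len(maps[0][0][0])
    return [
        [[sum(m[c][y][x] for m in maps) / n for x in range(width)]
         for y in range(height)]
        for c in range(classes)
    ]


def predict_tta(model: Model, image: Image) -> ProbMap:
    """Test-time augmentation: average the plain and the flipped prediction."""
    plain = model(image)
    flipped_back = hflip_probs(model(hflip_image(image)))
    return average_maps([plain, flipped_back])


def predict_ensemble(models: Sequence[Model], image: Image) -> ProbMap:
    """Ensemble: average the TTA predictions of several models."""
    return average_maps([predict_tta(m, image) for m in models])


def argmax_labels(probs: ProbMap) -> List[List[int]]:
    """Pick the most probable class for every pixel."""
    classes, height, width = len(probs), len(probs[0]), len(probs[0][0])
    return [
        [max(range(classes), key=lambda c: probs[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


def toy_model(image: Image) -> ProbMap:
    """Hypothetical two-class model: class-1 probability equals brightness."""
    return [[[1.0 - v for v in row] for row in image],
            [list(row) for row in image]]


if __name__ == "__main__":
    image = [[0.9, 0.1], [0.2, 0.8]]
    print(argmax_labels(predict_ensemble([toy_model, toy_model], image)))
```

Averaging probabilities rather than hard labels keeps each model's confidence in the vote; other combination rules, such as majority voting, are equally possible.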