Iterative Optimization Annotation Pipeline and ALSS-YOLO-Seg for Efficient Banana Plantation Segmentation in UAV Imagery

Ang He, Ximei Wu, Xing Xu, Jing Chen, Xiaobin Guo, Sheng Xu
2024-10-09
Abstract: Precise segmentation of Unmanned Aerial Vehicle (UAV)-captured images plays a vital role in tasks such as crop yield estimation and plant health assessment in banana plantations. By identifying and classifying planted areas, crop area can be calculated, which is indispensable for accurate yield predictions. However, segmenting banana plantation scenes requires a substantial amount of annotated data, and manual labeling of these images is both time-consuming and labor-intensive, limiting the development of large-scale datasets. Furthermore, varying target sizes, complex ground backgrounds, limited computational resources, and the need to identify crop categories correctly make segmentation even more difficult. To address these issues, we propose a comprehensive solution. First, we design an iterative optimization annotation pipeline that leverages SAM2's zero-shot capabilities to generate high-quality segmentation annotations, significantly reducing the cost and time of data annotation. Second, we develop ALSS-YOLO-Seg, an efficient lightweight segmentation model optimized for UAV imagery. The model's backbone includes an Adaptive Lightweight Channel Splitting and Shuffling (ALSS) module that improves information exchange between channels and optimizes feature extraction, aiding accurate crop identification. Additionally, a Multi-Scale Channel Attention (MSCA) module combines multi-scale feature extraction with channel attention to handle varying target sizes and complex ground backgrounds.
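The area calculation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a binary segmentation mask and a constant ground sampling distance (GSD) in metres per pixel; the function name `planted_area_m2` and the sample values are hypothetical and do not come from the paper.

```python
from typing import Sequence


def planted_area_m2(mask: Sequence[Sequence[int]], gsd_m: float) -> float:
    """Return the ground area covered by nonzero mask pixels.

    Assumes every pixel covers a gsd_m x gsd_m square on the ground,
    which only holds for nadir imagery at a constant flight altitude.
    """
    if gsd_m <= 0:
        raise ValueError("gsd_m must be positive")
    # Count pixels labelled as planted (any nonzero value).
    planted_pixels = sum(1 for row in mask for value in row if value)
    # Each pixel contributes gsd_m squared square metres.
    return planted_pixels * gsd_m ** 2


if __name__ == "__main__":
    # Hypothetical 3 x 4 mask with six planted pixels.
    demo_mask = [
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
    ]
    print(planted_area_m2(demo_mask, gsd_m=0.5))
```

In practice the GSD would be derived from flight altitude and camera parameters, and per-class masks would be summed separately to obtain an area for each crop category.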
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the precise segmentation of banana plantation images captured by Unmanned Aerial Vehicles (UAVs). Specifically, it targets three problems:

1. **High Data Annotation Cost**: Segmenting banana plantation scenes requires a large amount of annotated data. Manual annotation is both time-consuming and labor-intensive, which limits the development of large-scale datasets.
2. **Target Recognition in Complex Backgrounds**: Variations in target size and complex ground backgrounds make it hard for existing models to identify crop categories accurately.
3. **Model Deployment in Resource-Constrained Environments**: On resource-constrained platforms such as UAVs, existing models have too many parameters to be deployed directly.

To address these problems, the paper proposes two main methods:

1. **Iterative Optimization Annotation Pipeline**: The pipeline uses the zero-shot capability of the SAM2 model to generate high-quality segmentation masks, reducing the cost and time of data annotation. Through an iterative optimization process, it significantly reduces the manual annotation workload and improves segmentation accuracy.
2. **ALSS-YOLO-Seg Model**: A lightweight and efficient instance segmentation model designed for banana plantation scenes. It combines the Adaptive Lightweight Channel Splitting and Shuffling (ALSS) module with the Multi-Scale Channel Attention (MSCA) module to enhance feature extraction and information exchange while keeping computational overhead low in resource-constrained environments.

Together, these methods achieve efficient and precise segmentation of UAV-captured banana plantation images, supporting agricultural monitoring.
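The channel split-and-shuffle idea behind the ALSS module can be illustrated with a small sketch. This is a generic ShuffleNet-style operation on a list of channel labels, not the paper's ALSS implementation; the `ratio` parameter and both function names are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_channels(channels: Sequence[T], ratio: float = 0.5) -> Tuple[List[T], List[T]]:
    """Split channels into two branches; `ratio` is a hypothetical knob."""
    if len(channels) < 2:
        raise ValueError("need at least two channels to split")
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must be strictly between 0 and 1")
    # Clamp the cut so that both branches keep at least one channel.
    cut = max(1, min(len(channels) - 1, round(len(channels) * ratio)))
    return list(channels[:cut]), list(channels[cut:])


def channel_shuffle(channels: Sequence[T], groups: int) -> List[T]:
    """Interleave channels across `groups` (reshape, transpose, flatten)."""
    n = len(channels)
    if groups <= 0 or n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    # View the flat list as a (groups, per_group) grid and read it column by column.
    return [channels[g * per_group + i] for i in range(per_group) for g in range(groups)]


if __name__ == "__main__":
    left, right = split_channels(list(range(8)))
    # In a real block each branch would be processed before being concatenated.
    mixed = channel_shuffle(left + right, groups=2)
    print(mixed)  # [0, 4, 1, 5, 2, 6, 3, 7]
```

After the shuffle, neighbouring positions hold channels from different branches, which is how this family of operations lets information flow between channel groups at negligible computational cost.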