Abstract: Camellia oleifera is a crop of high economic value, yet it is particularly susceptible to diseases and pests that significantly reduce its yield and quality. Precise segmentation and classification of diseased Camellia leaves are therefore vital for managing pests and diseases effectively. Deep learning offers significant advantages in segmenting plant diseases and pests, particularly in complex image processing and automated feature extraction. However, when single-modal models are used to segment Camellia oleifera diseases, three critical challenges arise: (A) lesions may closely resemble the colors of the complex background; (B) small sections of diseased leaves overlap; and (C) multiple diseases may be present on a single leaf. These factors considerably hinder segmentation accuracy. We propose a novel multimodal model, the CNN-Transformer Dual U-shaped Network (CTDUNet), which is built on a CNN-Transformer architecture and integrates image and text information. The model first uses text data to compensate for the shortcomings of single-modal image features, enhancing its ability to distinguish lesions from environmental characteristics even when the two closely resemble one another. Additionally, we introduce Coordinate Space Attention (CSA), which focuses on the positional relationships between targets and thereby improves the segmentation of overlapping leaf edges. Furthermore, cross-attention (CA) is employed to align image and text features effectively, preserving local information and enhancing the perception and differentiation of various diseases. CTDUNet was evaluated on a self-made multimodal dataset and compared against several models, including DeeplabV3+, UNet, PSPNet, Segformer, HrNet, and Language meets Vision Transformer (LViT). The experimental results demonstrate that CTDUNet achieved a mean Intersection over Union (mIoU) of 86.14%, surpassing both the multimodal model and the best single-modal model by 3.91% and 5.84%, respectively. CTDUNet also performs in a well-balanced way across classes in the multi-class segmentation of Camellia oleifera diseases and pests. These results indicate that fused image and text information can be applied successfully to the segmentation of Camellia diseases.
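For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean Intersection over Union can be computed for multi-class segmentation masks. It is not the authors' evaluation code: the function name `mean_iou`, the flat-list mask representation, and the choice to skip classes absent from both masks are assumptions made for illustration, and the abstract does not say whether the paper averages per image or over the whole dataset.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in pred or target.

    pred, target: flat sequences of integer class labels, one per pixel
    (hypothetical representation chosen for this sketch).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both masks.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in at least one mask.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union == 0:
            continue  # assumption: ignore classes absent from both masks
        ious.append(inter / union)

    if not ious:
        raise ValueError("none of the classes occur in either mask")
    return sum(ious) / len(ious)


# Toy example: three classes, six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
# Per-class IoU is 1/3, 2/3 and 1/2, so the mean is 0.5.
print(mean_iou(pred, target, num_classes=3))
```

Because each class contributes equally to the average regardless of its pixel count, a model that neglects a rare disease class is penalized as heavily as one that neglects a common class, which is why mIoU is often reported alongside claims about balance across classes.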

CTDUNet: A Multimodal CNN-Transformer Dual U-Shaped Network with Coordinate Space Attention for Camellia oleifera Pests and Diseases Segmentation in Complex Environments
