RustQNet: Multimodal Deep Learning for Quantitative Inversion of Wheat Stripe Rust Disease Index

Jie Deng, Danfeng Hong, Chenyu Li, Jing Yao, Ziqian Yang, Zhijian Zhang, Jocelyn Chanussot
DOI: https://doi.org/10.1016/j.compag.2024.109245
IF: 8.3
2024-01-01
Computers and Electronics in Agriculture
Abstract: Quantitative remote sensing of crop diseases at the field or plot scale is essential for crop management. Conventional approaches frequently rely on single-modal remote sensing data alone, which limits their performance. How to use multimodal imagery, including high-spatial-resolution Red-Green-Blue (RGB) images, high-spectral-resolution multispectral (MS) images, and vegetation indices (VIs) informed by expert knowledge, to enhance quantitative inversion warrants further research. In this study, we propose a novel deep learning (DL)-based quantitative framework for analyzing multimodal remote sensing images, named RustQNet. This framework facilitates precise and high-throughput quantitative assessment of the wheat stripe rust (WSR) disease index (DI). RustQNet is designed to handle multimodal remote sensing images, such as RGB, MS, and VIs. To improve the fusion of different modalities, RustQNet incorporates a mutual information minimization (MIM) module, which encourages information complementarity across modalities in a more compact fashion. To better train our model, we construct a benchmark dataset of time-series multimodal UAV images covering the entire period of the WSR epidemic, from initial infection to severe outbreak. This dataset contains over 180 million quantitatively annotated pixels, making it the most comprehensive dataset currently available for the quantitative inversion of WSR. The results show that the R² value of the RGB+MS+VI three-modal model is 0.8024, an improvement of 17.65%-35.59% over the single-modality models and 1.27%-6.67% over the bimodal models. The RMSE of the three-modal model is 11.4776, which is 20.51%-30.71% lower than that of the single-modality models and 3.93%-11.56% lower than that of the bimodal models. When DI ≤ 10, the RMSE is 5.47, indicating potential for early detection. Compared with typical semantic segmentation models and sophisticated multimodal algorithms, RustQNet exhibited exceptional performance. The findings suggest that RustQNet may serve as a multimodal data processing tool for precise and efficient quantitative inversion of crop diseases, thereby offering valuable insights for the high-throughput quantitative inversion of additional phenotypic traits, including yield and plant height.
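The abstract reports R², RMSE, an RMSE restricted to DI ≤ 10, and percentage changes against other models. The following is a minimal sketch of how such figures are conventionally computed, not the authors' evaluation code. The function names and the sample DI values are hypothetical, and `relative_change` assumes the reported percentages are relative to the baseline, which the abstract does not state explicitly.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error over all samples."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_below(y_true, y_pred, threshold=10.0):
    """RMSE restricted to samples whose true DI is <= threshold."""
    kept = [(t, p) for t, p in zip(y_true, y_pred) if t <= threshold]
    return rmse([t for t, _ in kept], [p for _, p in kept])


def relative_change(new, baseline):
    """Percentage change of `new` relative to `baseline` (assumed convention)."""
    return 100.0 * (new - baseline) / baseline


# Hypothetical per-pixel DI values (0-100 scale), for illustration only.
di_true = [0.0, 5.0, 10.0, 40.0, 80.0]
di_pred = [2.0, 4.0, 13.0, 35.0, 85.0]

print(f"R^2          = {r_squared(di_true, di_pred):.4f}")
print(f"RMSE         = {rmse(di_true, di_pred):.4f}")
print(f"RMSE (DI<=10) = {rmse_below(di_true, di_pred):.4f}")
```

Restricting the RMSE to low-DI samples, as in `rmse_below`, isolates accuracy at the early stage of infection, which is the case the abstract highlights for early detection.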
What problem does this paper attempt to address?