EACT-Det: an Efficient Adjusting Criss-cross Windows Transformer Embedding Pyramid Networks for Similar Disease Detection

Fenmei Wang, Rujing Wang, Ziliang Huang, Shifeng Dong, Xiuzhen Wang, Qiong Zhou, Shijian Zheng, Liu
DOI: https://doi.org/10.1007/s11042-023-17360-z
IF: 2.577
2023-01-01
Multimedia Tools and Applications
Abstract: Crop disease detection is difficult because of variation in lighting and background, similar symptoms across diseases, and severe overlap of multiple diseases caused by shading. Convolutional kernels extract features with a sliding window and therefore lack the ability to capture global background information, which makes these problems hard to handle. To address this, we first develop an efficient, adjusting self-attention mechanism with vertical and horizontal windows. Then, within this mechanism, we design an efficient interactive spatial module in the multi-head attention to capture a more comprehensive association of the global contextual information of crop disease targets. Together these form an efficient criss-cross window transformer module, which addresses the large background areas and highly similar disease characteristics found in crop disease images. Finally, we design a global relationship aggregation module that integrates the global dependencies of the efficient criss-cross window transformer into the features of the pyramid network, which handles severely overlapping crop diseases more effectively. We refer to this model as the efficient adjusting criss-cross window transformer (EACT-Det) for crop disease detection. Experiments on a dataset containing 21 diseases from four crops (rice, wheat, maize and oilseed rape) show that, compared with Swin-T, a classic and strong Transformer model, our method improves accuracy by 1.9% and reduces the model's maximum parameter size by 5.96 M. In addition, the dataset of similar modalities introduced in this article provides a data foundation for future work on similar disease detection and is itself an important contribution of this article.
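As a rough illustration of the "vertical and horizontal windows" idea in the abstract, the following NumPy sketch restricts self-attention to horizontal stripes for half of the channels and to vertical stripes for the other half. It is not the authors' implementation: the function names, the `stripe` parameter, the channel split, and the use of the raw features as queries, keys and values (no learned projections, no interactive spatial module) are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def stripe_attention(x, stripe, horizontal):
    """Self-attention restricted to stripes of a feature map x of shape (H, W, C).

    Hypothetical helper: Q = K = V = the raw features (no learned weights).
    """
    if not horizontal:
        # Swap H and W so vertical stripes can reuse the horizontal code path.
        out = stripe_attention(x.transpose(1, 0, 2), stripe, True)
        return out.transpose(1, 0, 2)
    H, W, C = x.shape
    assert H % stripe == 0, "height must be divisible by the stripe width"
    # Group every `stripe` consecutive rows into one window of stripe*W tokens.
    tokens = x.reshape(H // stripe, stripe * W, C)
    scores = tokens @ tokens.transpose(0, 2, 1) / np.sqrt(C)
    out = softmax(scores, axis=-1) @ tokens
    return out.reshape(H, W, C)

def criss_cross_window_attention(x, stripe=2):
    """Half of the channels attend in horizontal stripes, half in vertical ones."""
    half = x.shape[-1] // 2
    h_out = stripe_attention(x[..., :half], stripe, horizontal=True)
    v_out = stripe_attention(x[..., half:], stripe, horizontal=False)
    return np.concatenate([h_out, v_out], axis=-1)

# Usage: an 8x8 feature map with 4 channels keeps its shape.
rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 8, 4))
print(criss_cross_window_attention(feat, stripe=2).shape)  # (8, 8, 4)
```

In this sketch each token sees a full row-band or column-band of the image in a single layer, so stacking such layers spreads context across the whole image faster than square local windows would. The abstract describes the mechanism as "adjusting"; one possible reading is that the stripe width changes with network depth, but the abstract does not say so.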