Local Reversible Transformer for semantic segmentation of grape leaf diseases

Xinxin Zhang,Fei Li,Haibin Jin,Weisong Mu
DOI: https://doi.org/10.1016/j.asoc.2023.110392
2023-01-01
SSRN Electronic Journal
Abstract:Grape leaf diseases segmentation is an essential basis for achieving precise diagnosis and identification of diseases. However, the complex background renders it difficult for small disease areas to be precisely segmented. The existing Transformer mainly focuses on utilizing key and value downsampling to improve model performance while neglecting that downsampling is irreversible with the loss of contextual information. To this end, this paper proposed a novel Locally Reversible Transformer (LRT) segmentation model for grape leaf diseases in natural scene images, whose representation is learned in a reversible downsampling manner. Specifically, a Local Learning Bottleneck (LLB) is developed to enhance local perception and extract richer semantic information of grape leaf diseases via inverted residual convolution. Furthermore, motivated by the wavelet theory, the Reversible Attention (RA) is designed to replace the original downsampling operation by introducing wavelet transform into the multi-headed attention and solving the problem of difficult detection and segmentation of small disease targets with complex backgrounds. Extensive experiments demonstrate that the segmentation performance of LRT outperforms state-of-the-art models with comparable GFLOPs and parameters. Moreover, LRT can retain more multi-grain information and can increase the receptive field to focus on small disease regions with complex backgrounds.
What problem does this paper attempt to address?