CoT-XNet: Contextual Transformer with Xception Network for diabetic retinopathy grading
Shuiqing Zhao, Yanan Wu, Mengmeng Tong, Yudong Yao, Wei Qian, Shouliang Qi
DOI: https://doi.org/10.1088/1361-6560/ac9fa0
IF: 3.5
2022-11-03
Physics in Medicine and Biology
Abstract: Objective: Diabetic retinopathy (DR) grading is primarily performed by assessing fundus images. Many types of lesions, such as microaneurysms, hemorrhages, and soft exudates, can be present simultaneously in a single image. However, these lesions may be small, making it difficult to differentiate adjacent DR grades even with deep convolutional neural networks (CNNs). Recently, the vision transformer (ViT) has shown comparable or even superior performance to CNNs, and it learns visual representations that differ from those of CNNs. Inspired by this finding, we propose a two-path Contextual Transformer with Xception Network (CoT-XNet) to improve the accuracy of DR grading. Approach: The representations learned by CoT through one path and those learned by the Xception network through another path are concatenated before the fully connected layer. In addition, dedicated pre-processing, data resampling, and test-time augmentation strategies are implemented. The performance of CoT-XNet is evaluated on the publicly available DDR, APTOS2019, and EyePACS datasets, which together include over 50,000 images. Ablation experiments and comprehensive comparisons with various state-of-the-art (SOTA) models have also been performed. Main results: The proposed CoT-XNet outperforms available SOTA models, with accuracy and Kappa of 83.10% and 0.8496 on DDR, 84.18% and 0.9000 on APTOS2019, and 84.10% and 0.7684 on EyePACS. Class activation maps of the CoT and Xception networks are different and complementary in most images. Significance: By concatenating the different visual representations learned by the CoT and Xception networks, CoT-XNet can accurately grade DR from fundus images and shows good generalizability. CoT-XNet will promote the application of artificial intelligence-based systems in the DR screening of large-scale populations.
engineering, biomedical; radiology, nuclear medicine & medical imaging
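
The fusion step described under Approach (features from the CoT path and the Xception path concatenated before the fully connected layer) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy feature sizes (4 and 6), the random weights, and the choice of five output grades are all assumptions made for illustration, and the two backbones are replaced by placeholder feature vectors.

```python
import math
import random


def concat_features(cot_feat, xcep_feat):
    """Join the two per-image feature vectors end to end."""
    return list(cot_feat) + list(xcep_feat)


def fully_connected(x, weights, bias):
    """Dense layer: one output logit per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(cot_feat, xcep_feat, weights, bias):
    """Concatenate both paths, then apply the dense layer and softmax."""
    fused = concat_features(cot_feat, xcep_feat)
    return softmax(fully_connected(fused, weights, bias))


if __name__ == "__main__":
    rng = random.Random(0)
    # Placeholder features standing in for the two backbone outputs
    # (hypothetical sizes; real backbones emit much longer vectors).
    cot_feat = [rng.random() for _ in range(4)]
    xcep_feat = [rng.random() for _ in range(6)]
    num_grades = 5  # assumption: five DR grades, 0 to 4
    fused_dim = len(cot_feat) + len(xcep_feat)
    weights = [[rng.uniform(-1, 1) for _ in range(fused_dim)]
               for _ in range(num_grades)]
    bias = [0.0] * num_grades
    probs = fuse_and_classify(cot_feat, xcep_feat, weights, bias)
    print("predicted grade:", probs.index(max(probs)))
```

Because the dense layer sees both feature vectors side by side, it can weight evidence from either path per grade, which is one plausible reading of why complementary class activation maps would help.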