Recognition of Diabetic Retinopathy Grades Based on Data Augmentation and Attention Mechanisms

Xueri Li, Li Wen, Fanyu Du, Lei Yang, Jianfang Wu
DOI: https://doi.org/10.1002/ima.23201
IF: 2.177
2024-10-24
International Journal of Imaging Systems and Technology
Abstract: Diabetic retinopathy is a complication of diabetes and one of the leading causes of vision loss. Early detection and treatment are essential to prevent vision loss. Deep learning has made great strides in medical image processing and can serve as an aid for medical practitioners. However, unbalanced datasets, sparse focal areas, small differences between adjacent disease grades, and varied manifestations of the same disease grade make deep learning models difficult to train, and their generalization performance and robustness are inadequate. To address the imbalance in sample numbers between classes, this work proposes using VQ‐VAE to reconstruct affine‐transformed images in order to enrich and balance the dataset. Test results show that the model's average reconstruction error is 0.0001 and that the mean structural similarity between reconstructed and original images is 0.967. This indicates that the reconstructed images differ from the originals yet belong to the same category, which expands and diversifies the dataset. To address focal area sparsity and disease grade disparity, this work uses ResNeXt50 as the backbone network and constructs several attention networks by modifying the network structure and embedding different attention modules. Experiments demonstrate that the convolutional attention network outperforms ResNeXt50 in terms of Precision, Sensitivity, Specificity, F1 Score, Quadratic Weighted Kappa Coefficient, Accuracy, and robustness against salt‐and‐pepper noise, Gaussian noise, and gradient perturbation. Finally, heat maps of each model recognizing the fundus image were plotted using the Grad‐CAM method. The heat maps show that the attention networks are more effective than the non‐attention network ResNeXt50 at attending to the fundus image.
engineering, electrical & electronic; optics; imaging science & photographic technology
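
The abstract lists the Quadratic Weighted Kappa Coefficient among its evaluation metrics without defining it. As a minimal sketch of the standard textbook definition (not the authors' code), the function below computes it for ordinal grade labels. The function name, the default of five grades labelled 0 to 4, and the value returned in the degenerate single-class case are assumptions made for illustration only.

```python
def quadratic_weighted_kappa(y_true, y_pred, num_classes=5):
    """Cohen's kappa with quadratic weights for integer labels 0..num_classes-1.

    num_classes=5 is an assumption (a common grading scale), not a value
    taken from the abstract.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")

    # Observed confusion matrix: rows are true grades, columns are predictions.
    observed = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(num_classes))
                  for j in range(num_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Quadratic penalty: grows with the squared distance between grades.
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            # Count expected if true grade and prediction were independent.
            expected = row_totals[i] * col_totals[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Assumed convention: all labels and predictions fall in one class.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected
```

Because the penalty grows with the squared distance between grades, confusing adjacent grades costs much less than confusing distant ones, which suits ordinal severity scales. The score is 1 for perfect agreement and 0 for agreement no better than chance.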