ViT-DR: Vision Transformers in Diabetic Retinopathy Grading Using Fundus Images

Tripti Goel, Parthapratim Roy, N. Mohan, R. Murugan
DOI: https://doi.org/10.1109/R10-HTC54060.2022.9930027
2022-09-16
Abstract: Diabetic retinopathy (DR) is a complication of diabetes caused by blood leakage into the retinal tissues. DR has become one of the primary causes of visual impairment worldwide, and it is the leading risk factor for poor vision in patients aged 25 to 74. Many symptoms cause retinal degeneration, resulting in vision loss. Early DR detection helps avoid vision loss. Manual DR grading using fundus images is time-consuming and requires expert ophthalmologists. Therefore, we propose a Vision Transformer (ViT) based DR severity classification method. In this work, the fundus images are first divided into non-overlapping patches to retain location information. The patches are then flattened into a sequence and passed through linear and positional embedding. The resulting sequence is fed into several multi-head attention layers, which produce the final representation. In the classification stage, the first token of the output sequence is fed into a softmax classification layer, which produces the recognition output. The proposed method is evaluated on the Kaggle and IDRiD databases. The results obtained are better than those of convolutional neural networks and state-of-the-art methods.
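The pipeline described in the abstract (patch splitting, linear and positional embedding, multi-head attention, softmax over the first token) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the 16-pixel patch size, the embedding width, the number of layers and heads, the five output grades, and the random weights are all assumptions chosen for illustration, and layer normalization and the MLP sub-blocks of a full ViT are omitted for brevity.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into non-overlapping, flattened patches."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    gh, gw = h // patch, w // patch
    x = image.reshape(gh, patch, gw, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)            # (gh, gw, patch, patch, c)
    return x.reshape(gh * gw, patch * patch * c)

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, wq, wk, wv, wo, heads):
    """Scaled dot-product self-attention over a (tokens, dim) sequence."""
    n, d = x.shape
    dh = d // heads

    def split(t):                             # (n, d) -> (heads, n, dh)
        return t.reshape(n, heads, dh).transpose(1, 0, 2)

    q, k, v = split(x @ wq), split(x @ wk), split(x @ wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(dh)   # (heads, n, n)
    out = softmax(scores) @ v                          # (heads, n, dh)
    return out.transpose(1, 0, 2).reshape(n, d) @ wo

def init_params(rng, patch=16, channels=3, dim=64, num_patches=16,
                depth=2, classes=5):
    """Random weights; all sizes here are illustrative assumptions."""
    s = 0.02
    return {
        "embed": rng.normal(0, s, (patch * patch * channels, dim)),
        "cls": rng.normal(0, s, (1, dim)),
        "pos": rng.normal(0, s, (num_patches + 1, dim)),
        "layers": [tuple(rng.normal(0, s, (dim, dim)) for _ in range(4))
                   for _ in range(depth)],
        "head": rng.normal(0, s, (dim, classes)),
    }

def vit_forward(image, params, patch=16, heads=4):
    """Return class probabilities for one image."""
    tokens = patchify(image, patch) @ params["embed"]  # linear embedding
    tokens = np.vstack([params["cls"], tokens])        # prepend first token
    tokens = tokens + params["pos"]                    # positional embedding
    for layer in params["layers"]:                     # attention + residual
        tokens = tokens + multi_head_self_attention(tokens, *layer,
                                                    heads=heads)
    return softmax(tokens[0] @ params["head"])         # classify first token

rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))      # stand-in for a preprocessed fundus image
params = init_params(rng)
probs = vit_forward(image, params)   # one probability per assumed DR grade
```

With untrained random weights the output probabilities carry no diagnostic meaning; the sketch only shows how the tensor shapes flow from image to patches to tokens to a class distribution.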
Medicine