M2LC-Net: A Multi-Modal Multi-Disease Long-Tailed Classification Network for Real Clinical Scenes

Zhonghong Ou, Wenjun Chai, Lifei Wang, Ruru Zhang, Jiawen He, Meina Song, Lifei Yuan, Shengjuan Zhang, Yanhui Wang, Huan Li, Xin Jia, Rujian Huang
DOI: https://doi.org/10.23919/jcc.2021.09.016
2021-01-01
China Communications
Abstract: Leveraging deep learning-based techniques to classify diseases has attracted extensive research interest in recent years. Nevertheless, most current studies consider only single-modal medical images, and the number of ophthalmic diseases that can be classified is relatively small. Moreover, the imbalanced data distribution across different ophthalmic diseases is not taken into consideration, which limits the application of deep learning techniques in realistic clinical scenes. In this paper, we propose a Multimodal Multi-disease Long-tailed Classification Network (M2LC-Net) in response to these challenges. M2LC-Net leverages ResNet18-CBAM to extract features from fundus images and Optical Coherence Tomography (OCT) images separately, and then conducts feature fusion to classify 11 common ophthalmic diseases. Moreover, Class Activation Mapping (CAM) is employed to visualize each modality and improve the interpretability of M2LC-Net. We conduct comprehensive experiments on a realistic dataset collected from a Grade III Level A ophthalmology hospital in China, comprising 34,396 images with 11 disease labels. Experimental results demonstrate the effectiveness of the proposed M2LC-Net. Compared with the state of the art, various performance metrics are improved significantly; specifically, Cohen's kappa coefficient κ is improved by 3.21%.
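Because the headline result is reported as Cohen's kappa coefficient κ, a minimal sketch of how that metric is computed may help in reading the number. The abstract does not say which variant is used, so the sketch assumes the standard unweighted form, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from the class frequencies. The function name `cohens_kappa` and the toy integer labels are hypothetical and are not taken from the paper or its dataset.

```python
from collections import Counter


def cohens_kappa(y_true, y_pred):
    """Unweighted Cohen's kappa between two equally long label sequences.

    Assumption: the unweighted variant; the abstract does not specify one.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)

    # Observed agreement: fraction of samples on which the two labelings match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Chance agreement: for each class, the product of its marginal
    # frequencies in the two labelings, summed over classes.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    if p_e == 1.0:
        # Both labelings use one identical class everywhere: kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1.0 - p_e)


# Toy, made-up labels for three imbalanced classes (not data from the paper).
y_true = [0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 2, 0]
print(round(cohens_kappa(y_true, y_pred), 4))
```

On a long-tailed label distribution, plain accuracy can look high when a model mostly predicts the frequent classes; κ discounts the agreement that the class frequencies alone would produce, which is one plausible reason for reporting it in this setting.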