Leveraging Multimodal Fusion for Enhanced Diagnosis of Multiple Retinal Diseases in Ultra-wide OCTA

Hao Wei, Peilun Shi, Guitao Bai, Minqing Zhang, Shuangle Li, Wu Yuan
2023-11-17
Abstract: Ultra-wide optical coherence tomography angiography (UW-OCTA) is an emerging imaging technique that offers significant advantages over traditional OCTA by providing an exceptionally wide scanning range of up to $24 \times 20\ \mathrm{mm}^{2}$, covering both the anterior and posterior regions of the retina. However, the currently accessible UW-OCTA datasets suffer from limited comprehensive hierarchical information and corresponding disease annotations. To address this limitation, we have curated the pioneering M3OCTA dataset, which is the first multimodal (i.e., multilayer), multi-disease, and widest field-of-view UW-OCTA dataset. Furthermore, the effective utilization of multi-layer ultra-wide ocular vasculature information from UW-OCTA remains underdeveloped. To tackle this challenge, we propose the first cross-modal fusion framework that leverages multi-modal information for diagnosing multiple diseases. Through extensive experiments conducted on our openly available M3OCTA dataset, we demonstrate the effectiveness and superior performance of our method, both in fixed and varying modalities settings. The construction of the M3OCTA dataset, the first multimodal OCTA dataset encompassing multiple diseases, aims to advance research in the ophthalmic image analysis community.
Image and Video Processing, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses two obstacles to diagnosing retinal diseases with ultra-wide optical coherence tomography angiography (UW-OCTA):

1. **Dataset Limitations**: The currently available UW-OCTA datasets lack comprehensive layered information and the corresponding disease annotations.
2. **Insufficient Utilization of Multimodal Information**: Although UW-OCTA provides multi-layer retinal vascular information, current methods do not make full use of it.

To address these issues, the authors constructed M3OCTA, the first multimodal (i.e., multilayer), multi-disease UW-OCTA dataset with the widest field of view, and proposed a cross-modal fusion framework (CMF-Net) that uses the multimodal information to diagnose multiple retinal diseases. The method also accepts a varying number of input modalities at inference time, which makes it more flexible and easier to fit into clinical workflows. In the experiments on M3OCTA, the authors report superior performance in both the fixed-modality and the varying-modality settings, with stable results as the set of input modalities changes.
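To make the idea of accepting a varying number of modalities at inference more concrete, here is a minimal sketch of one simple way to fuse whatever inputs happen to be available: average the per-modality feature vectors that are present and skip the missing ones. This is an illustrative assumption, not the fusion mechanism of CMF-Net, which this summary does not describe. The function name `fuse_available` and the modality keys (`layer_a`, `layer_b`, `layer_c`) are hypothetical, and the feature vectors stand in for the output of some upstream per-layer encoder.

```python
from typing import Dict, List, Optional, Sequence


def fuse_available(features: Dict[str, Optional[Sequence[float]]]) -> List[float]:
    """Average the feature vectors of the modalities that are present.

    A modality that is unavailable at inference time is passed as None and
    is simply left out of the average. Hypothetical helper for illustration;
    it is NOT the fusion used by CMF-Net.
    """
    present = [list(vec) for vec in features.values() if vec is not None]
    if not present:
        raise ValueError("at least one modality is required")

    dim = len(present[0])
    if any(len(vec) != dim for vec in present):
        raise ValueError("all feature vectors must share one dimension")

    # zip(*present) walks the vectors column by column, so each output
    # element is the mean of that feature over the available modalities.
    return [sum(column) / len(present) for column in zip(*present)]


# All three (hypothetical) layers available.
full = fuse_available({
    "layer_a": [1.0, 2.0],
    "layer_b": [3.0, 4.0],
    "layer_c": [5.0, 6.0],
})

# One layer missing at inference time: same call, same output shape.
partial = fuse_available({
    "layer_a": [1.0, 2.0],
    "layer_b": None,
    "layer_c": [5.0, 6.0],
})
```

Because the fused vector has the same length regardless of how many modalities were supplied, a single downstream classifier can consume it in both cases. A learned fusion (for example, attention over modality features) would replace the plain mean here.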