Abstract: Recent advances in deep learning and imaging technologies have revolutionized automated medical image analysis, especially in diagnosing Alzheimer's disease (AD) through neuroimaging. Although several imaging modalities are often available for the same patient, multi-modal models that leverage them remain underexplored. This paper addresses this gap by proposing and evaluating classification models using 2D and 3D MRI images and amyloid PET scans in uni-modal and multi-modal frameworks. Our findings demonstrate that models using volumetric data learn more effective representations than those using only 2D images. Furthermore, integrating multiple modalities significantly improves model performance over single-modality approaches. We achieved state-of-the-art performance on the OASIS-3 cohort. Additionally, explainability analyses with Grad-CAM indicate that our model focuses on crucial AD-related regions for its predictions, underscoring its potential to aid in understanding the disease's causes.

What problem does this paper attempt to address?

This paper addresses the automated detection of Alzheimer's disease (AD) with a multimodal approach that combines 3D MRI and amyloid PET scans. Although several imaging modalities are often available for the same patient, there is still limited research on integrating them into a single multimodal model. The paper fills this gap by proposing and evaluating classification models using 2D and 3D MRI images and amyloid PET scans.

Specifically, the paper aims to:

1. **Improve diagnostic accuracy**: Enhance the automatic detection performance for Alzheimer's disease by combining multiple imaging modalities.
2. **Explore the advantages of volumetric data**: Verify whether models using 3D data learn more effective feature representations than models using only 2D images.
3. **Interpret model decisions**: Reveal the key brain regions the model focuses on during prediction through interpretability analysis methods such as Grad-CAM, thereby enhancing the model's transparency and credibility.

The main contributions of the paper include:

- Achieving state-of-the-art performance on the OASIS-3 cohort.
- Demonstrating that the multimodal model significantly outperforms the unimodal models in diagnostic accuracy.
- Confirming through interpretability analysis that the brain regions the model focuses on are consistent with pathological areas related to Alzheimer's disease.
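To make the Grad-CAM step mentioned above more concrete, here is a minimal pure-Python sketch of the standard Grad-CAM formula: each channel of a convolutional layer is weighted by its spatially averaged gradient, the weighted channels are summed, and a ReLU keeps only positive evidence. It is not the paper's implementation. The function name `grad_cam_heatmap` and the flattened `activations[k][v]` layout (channel `k`, voxel `v`) are assumptions made for illustration; in practice the activations and gradients would come from a deep learning framework and would be reshaped back into a 3D volume.

```python
def grad_cam_heatmap(activations, gradients):
    """Compute a Grad-CAM heatmap from flattened feature maps.

    activations[k][v]: value of channel k at voxel v (hypothetical layout).
    gradients[k][v]:   derivative of the class score w.r.t. activations[k][v].
    Returns one non-negative importance value per voxel.
    """
    heatmap = [0.0] * len(activations[0])
    for channel_act, channel_grad in zip(activations, gradients):
        # Channel weight: the gradient averaged over all voxels.
        alpha = sum(channel_grad) / len(channel_grad)
        for v, value in enumerate(channel_act):
            heatmap[v] += alpha * value
    # ReLU keeps only voxels that increase the class score.
    return [max(0.0, h) for h in heatmap]


# Toy example: 2 channels, 2 voxels each.
activations = [[1.0, 2.0], [3.0, 4.0]]
gradients = [[0.5, 0.5], [0.0, 1.0]]
print(grad_cam_heatmap(activations, gradients))  # [2.0, 3.0]
```

Because the voxels are treated as a flat list, the same computation applies to 2D slices and 3D volumes alike; only the final reshape differs.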

Automated detection of Alzheimer's disease: a multi-modal approach with 3D MRI and amyloid PET