Multimodal Explainability via Latent Shift applied to COVID-19 stratification

Valerio Guarrasi, Lorenzo Tronchin, Domenico Albano, Eliodoro Faiella, Deborah Fazzini, Domiziana Santucci, Paolo Soda
2024-07-22
Abstract: We are witnessing a widespread adoption of artificial intelligence in healthcare. However, most of the advancements in deep learning in this area consider only unimodal data, neglecting other modalities. A multimodal interpretation is nevertheless necessary for supporting diagnosis, prognosis and treatment decisions. In this work we present a deep architecture that jointly learns modality reconstructions and sample classifications using tabular and imaging data. The explanation of the decision taken is computed by applying a latent shift that simulates a counterfactual prediction, revealing the features of each modality that contribute the most to the decision and yielding a quantitative score indicating the modality importance. We validate our approach in the context of the COVID-19 pandemic using the AIforCOVID dataset, which contains multimodal data for the early identification of patients at risk of a severe outcome. The results show that the proposed method provides meaningful explanations without degrading the classification performance.
Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

The paper aims to address the following issues:

1. **Application of Multimodal Data in the Medical Field**: Most deep learning models in the medical field currently consider only unimodal data, ignoring other available information sources. Medical diagnosis, however, is inherently multimodal and requires AI methods capable of handling different data modalities.
2. **Explainable Artificial Intelligence (XAI)**: Although complex AI models have achieved significant results in many fields, they often operate as black boxes, lacking transparency and trustworthiness. In the biomedical field in particular, model interpretability is crucial, so researchers are working to develop models that can explain their decision-making processes.
3. **Application of Multimodal Explanations in Medicine**: Multimodal models extract more comprehensive information than unimodal models, so their explanations can provide more insight into medical data. Nevertheless, the biomedical literature currently lacks interpretable multimodal deep learning models.

Specifically, the paper proposes a new end-to-end multimodal architecture that combines tabular data and image data, achieving interpretability through joint learning of modality reconstruction and sample classification. The method reveals the contribution of each modality to the decision by simulating counterfactual predictions, and it provides a quantitative score representing the importance of each modality. The authors validated the method on the AIforCOVID dataset for the early identification of COVID-19 patients at risk of severe outcomes, showing that it can provide meaningful explanations without reducing classification performance.
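
To make the latent-shift idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy linear autoencoder over concatenated image and tabular features with a logistic classifier on the latent code, whereas the paper uses a deep architecture. The function name `latent_shift_explanation`, the weight matrices, the step size `lam`, and the normalized mean-absolute-change importance score are all hypothetical choices made for illustration; the summary above does not specify how the paper defines its modality score.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def latent_shift_explanation(x_img, x_tab, W_enc, W_dec, w_clf, b_clf, lam):
    """Toy latent shift: linear autoencoder plus logistic classifier head.

    All names and the importance score are illustrative assumptions.
    """
    n_img = x_img.shape[0]
    x = np.concatenate([x_img, x_tab])

    z = W_enc @ x                     # shared latent code for both modalities
    p = sigmoid(w_clf @ z + b_clf)    # original prediction

    # Analytic gradient of the prediction with respect to the latent code.
    grad_z = p * (1.0 - p) * w_clf

    # Shift the latent code along the gradient to simulate a counterfactual.
    z_shift = z + lam * grad_z
    p_shift = sigmoid(w_clf @ z_shift + b_clf)

    # Per-feature change between the two reconstructions, split by modality.
    delta = np.abs(W_dec @ z_shift - W_dec @ z)
    delta_img, delta_tab = delta[:n_img], delta[n_img:]

    # Assumed importance score: each modality's share of the mean change.
    m_img, m_tab = delta_img.mean(), delta_tab.mean()
    total = m_img + m_tab
    if total == 0.0:
        importance = {"image": 0.5, "tabular": 0.5}
    else:
        importance = {"image": m_img / total, "tabular": m_tab / total}
    return p, p_shift, delta_img, delta_tab, importance


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_img, x_tab = rng.normal(size=16), rng.normal(size=4)  # synthetic inputs
    W_enc = 0.1 * rng.normal(size=(3, 20))
    W_dec = rng.normal(size=(20, 3))
    w_clf = rng.normal(size=3)
    p, p_cf, d_img, d_tab, imp = latent_shift_explanation(
        x_img, x_tab, W_enc, W_dec, w_clf, 0.0, lam=-5.0
    )
    print(p, p_cf, imp)
```

A negative `lam` pushes the latent code toward a lower predicted probability, so the features with the largest reconstruction change are the ones the toy model relies on most for that decision. The inputs here are synthetic and carry no clinical meaning.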