MAIN - Multimodal Attention-based Fusion Networks for Diagnosis Prediction.

Ying An, Haojia Zhang, Yu Sheng, Jianxin Wang, Xianlai Chen
DOI: https://doi.org/10.1109/bibm52615.2021.9669634
2021-01-01
Abstract: Predicting future diagnoses from patients’ historical Electronic Health Records (EHR) is a significant task in healthcare. EHR consist of data from multiple modalities; each modality has different features and contains a wealth of patient information. However, most existing EHR-based prediction methods either use only unimodal data or fail to fully explore the correlation between different modalities when fusing multimodal data. To address these challenges, we propose Multimodal Attention-based fusIon Networks (MAIN) for diagnosis prediction. In this model, we first design a dedicated feature extraction module for each modality. Then, a two-layer inter-modal correlation module is applied to capture the inter-modal correlation. Finally, a multimodal fusion module based on weighted averaging integrates the representations derived from the different modalities and their correlation to obtain the patient representation used for diagnosis prediction. We evaluate the proposed model on two medical datasets, and the experimental results demonstrate the effectiveness of MAIN.
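The weighted-averaging fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the weights are obtained, so this example assumes each representation comes with a scalar relevance score that is normalized by a softmax; the function names `softmax` and `weighted_average_fusion` are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Normalize raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_average_fusion(representations, scores):
    """Fuse equal-length vectors into one vector by a weighted average.

    representations: list of vectors (lists of floats), one per modality
        or per inter-modal correlation feature.
    scores: one scalar relevance score per vector (an assumption of this
        sketch; the paper may derive its weights differently).
    """
    if not representations or len(representations) != len(scores):
        raise ValueError("need one score per representation")
    dim = len(representations[0])
    if any(len(r) != dim for r in representations):
        raise ValueError("all representations must have the same length")

    weights = softmax(scores)
    fused = [
        sum(w * r[i] for w, r in zip(weights, representations))
        for i in range(dim)
    ]
    return fused, weights
```

With equal scores the fusion reduces to a plain mean of the input vectors; a larger score pulls the fused patient representation toward the corresponding input.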