MSAFusionNet: Multiple Subspace Attention Based Deep Multi-modal Fusion Network

Sen Zhang, Changzheng Zhang, Lanjun Wang, Cixing Li, Dandan Tu, Rui Luo, Guojun Qi, Jiebo Luo
DOI: https://doi.org/10.1007/978-3-030-32692-0_7
2019-01-01
Abstract: Doctors commonly consider multi-modal information simultaneously when making a diagnosis. However, how to use multi-modal medical images effectively in this setting has not been fully studied in deep learning. In this paper, we address the task of end-to-end segmentation based on multi-modal data and propose a novel deep learning framework, the multiple subspace attention-based deep multi-modal fusion network (henceforth MSAFusionNet). More specifically, MSAFusionNet consists of three main components: (1) a multiple subspace attention model that contains inter-attention modules and generalized squeeze-and-excitation modules, (2) a multi-modal fusion network that leverages CNN-LSTM layers to integrate sequential multi-modal input images, and (3) a densely-dilated U-Net as the encoder-decoder backbone for image segmentation. Experiments on the ISLES 2018 dataset show that MSAFusionNet achieves state-of-the-art segmentation accuracy.
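The abstract does not say how the generalized squeeze-and-excitation modules differ from the standard block, so the following is only a minimal sketch of standard squeeze-and-excitation channel gating, given as a reference point and not as the authors' module. The function name `squeeze_excite`, the weight matrices `w1` and `w2`, and all numeric values are hypothetical and chosen for illustration.

```python
import math


def squeeze_excite(x, w1, w2):
    """Standard squeeze-and-excitation gating (not the paper's generalized variant).

    x  : list of C channels, each an H x W list of lists of floats
    w1 : (C // r) x C bottleneck weights (hypothetical values)
    w2 : C x (C // r) expansion weights (hypothetical values)
    """
    # Squeeze: global average pooling gives one scalar per channel.
    z = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    h = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excitation, step 2: expansion layer followed by a sigmoid gate per channel.
    s = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, h)))) for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    y = [[[s[c] * v for v in row] for row in ch] for c, ch in enumerate(x)]
    return y, s


# Toy input: two 2 x 2 channels, e.g. two modalities stacked as channels.
x = [[[1.0, 1.0], [1.0, 1.0]],
     [[3.0, 3.0], [3.0, 3.0]]]
w1 = [[0.5, 0.5]]      # reduce 2 channels to 1 hidden unit
w2 = [[1.0], [-1.0]]   # expand back to 2 channel gates
y, gates = squeeze_excite(x, w1, w2)
```

With these toy weights the first channel receives a gate above 0.5 and the second a gate below 0.5, so the block amplifies one channel relative to the other based only on global channel statistics.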