Abstract: Multimodal magnetic resonance imaging (MRI) provides complementary information about targets, and segmentation of multimodal MRI is widely used as an essential preprocessing step for initial diagnosis, stage differentiation, and post-treatment efficacy evaluation in clinical practice. Whether the goal is to segment a main modality or every modality, it is important to enhance the visual information by modeling the connections among modalities and effectively fusing their features. However, existing multimodal segmentation methods share a drawback: they inadvertently drop modality-specific information during fusion. Recently, graph learning-based methods have been applied to segmentation and have achieved considerable improvements by modeling the relationships across feature regions and reasoning with global information. In this paper, we propose a graph learning-based approach that efficiently extracts modality-specific features and establishes regional correspondence among all modalities. Specifically, we first project features into a graph domain and employ graph convolution to propagate information across all regions, learning global modality-specific features. We then propose a mutual information-based graph co-attention module that learns the weight coefficients of a bipartite graph, constructed from the fully connected graphs of the different modalities in the graph domain, and selectively fuses the node features. Based on the deformation diagram between the spatial and graph spaces and our proposed graph co-attention module, we present a multimodal prior-guided segmentation framework that uses two strategies for two clinical situations: a Modality-Specific Learning Strategy and a Co-Modality Learning Strategy. In addition, the improved Co-Modality Learning Strategy uses trainable weights in the multi-task loss to optimize the proposed framework.
We validated our proposed modules and framework on two multimodal MRI datasets: our private liver lesion dataset and a public prostate zone dataset. Experimental results on both datasets demonstrate the superiority of the proposed approach.
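
The fusion step described in the abstract (weighting the edges of a bipartite graph between the node sets of two modalities, then selectively fusing node features) can be illustrated with a minimal sketch. This is not the authors' module: the abstract does not give the mutual information-based weighting, so a plain dot-product affinity followed by a softmax is swapped in as a stand-in, and the function names `softmax` and `co_attention_fuse`, the residual fusion rule, and the toy node features are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def co_attention_fuse(nodes_a, nodes_b):
    """Fuse modality-B node features into modality-A nodes.

    nodes_a, nodes_b: lists of equal-length feature vectors, one per graph node.
    Returns (weights, fused), where weights[i][j] is the bipartite edge weight
    from node i of modality A to node j of modality B.
    Assumption: dot-product affinity stands in for the paper's
    mutual information-based weighting, which the abstract does not specify.
    """
    weights = []
    for a in nodes_a:
        # Affinity of this A-node with every B-node.
        scores = [sum(x * y for x, y in zip(a, b)) for b in nodes_b]
        weights.append(softmax(scores))

    fused = []
    for a, w in zip(nodes_a, weights):
        # Weighted sum of B-node features (the message passed across the bipartite graph).
        message = [
            sum(w[j] * nodes_b[j][k] for j in range(len(nodes_b)))
            for k in range(len(a))
        ]
        # Residual fusion keeps the modality-specific features of A (an assumed design choice).
        fused.append([x + m for x, m in zip(a, message)])
    return weights, fused


# Toy example: 2 nodes for modality A, 3 nodes for modality B, 2 channels each.
nodes_a = [[1.0, 0.0], [0.0, 1.0]]
nodes_b = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
weights, fused = co_attention_fuse(nodes_a, nodes_b)
```

The residual addition is used here so that each modality-A node retains its own features after fusion, which mirrors the abstract's concern about losing modality-specific information; whether the proposed module fuses features this way is not stated in the abstract.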

Learning Cross-Modal Aligned Representation with Graph Embedding

Learning Visually Aligned Semantic Graph for Cross-Modal Manifold Matching.

Graph Embedding Learning for Cross-Modal Information Retrieval.

Cross-modal Metric Learning with Graph Embedding.

Mutual Information-Based Graph Co-Attention Networks for Multimodal Prior-Guided Magnetic Resonance Imaging Segmentation

Cross-modal alignment with graph reasoning for image-text retrieval

Cross-graph Embedding with Trainable Proximity for Graph Alignment

Improving Supervised Cross-modal Retrieval with Semantic Graph Embedding

Deep Multi-Graph Hierarchical Enhanced Semantic Representation for Cross-Modal Retrieval

Cross-view Graph Contrastive Representation Learning on Partially Aligned Multi-view Data

Cross-Graph Attention Enhanced Multi-Modal Correlation Learning for Fine-Grained Image-Text Retrieval

Cross‐modal fusion encoder via graph neural network for referring image segmentation

Weighted Graph-structured Semantics Constraint Network for Cross-Modal Retrieval

Semantic Modeling of Textual Relationships in Cross-modal Retrieval

Learning Aligned Image-Text Representations Using Graph Attentive Relational Network

Hierarchical Cross-Modal Graph Consistency Learning for Video-Text Retrieval.

Bridging Multimedia Heterogeneity Gap Via Graph Representation Learning for Cross-Modal Retrieval.

Semi-supervised constrained graph convolutional network for cross-modal retrieval

Multicenter clinical trial of implanted norethindrone pellets for long-acting contraception in women. Program for Applied Research on Fertility Regulation.

Adversarial pre-optimized graph representation learning with double-order sampling for cross-modal retrieval

Deep Compositional Cross-modal Learning to Rank via Local-Global Alignment