Abstract: Background and objective: [18F]-fluorodeoxyglucose (FDG) positron emission tomography–computed tomography (PET-CT) is now the preferred imaging modality for staging many cancers. PET images characterize tumoral glucose metabolism, while CT depicts the complementary anatomical localization of the tumor. Automatic tumor segmentation is an important step in image analysis for computer-aided diagnosis systems. Recently, fully convolutional networks (FCNs), with their ability to leverage annotated datasets and extract image feature representations, have become the state of the art in tumor segmentation. Few FCN-based methods support multi-modality images, and current methods have primarily focused on fusing multi-modality image features at various stages: early-fusion, where the multi-modality image features are fused prior to the FCN; late-fusion, where the resultant features are fused; and hyper-fusion, where multi-modality image features are fused across multiple image feature scales. Early- and late-fusion methods, however, have inherently limited freedom to fuse complementary multi-modality image features. Hyper-fusion methods learn different image features across different image feature scales, which can result in inaccurate segmentations, particularly when the tumors have heterogeneous textures.
Methods: We propose a recurrent fusion network (RFN), which consists of multiple recurrent fusion phases that progressively fuse the complementary multi-modality image features with intermediary segmentation results derived at individual recurrent fusion phases: (1) the recurrent fusion phases iteratively learn the image features and then refine the subsequent segmentation results; and (2) the intermediary segmentation results allow our method to focus on learning the multi-modality image features around these intermediary segmentation results, which minimizes the risk of inconsistent feature learning.
Results: We evaluated our method on two pathologically proven non-small cell lung cancer PET-CT datasets. We compared our method to the commonly used fusion methods (early-fusion, late-fusion and hyper-fusion) and the state-of-the-art PET-CT tumor segmentation methods on various network backbones (ResNet, DenseNet and 3D-UNet). Our results show that the RFN provides more accurate segmentation compared to the existing methods and is generalizable to different datasets.
Conclusions: We show that learning through multiple recurrent fusion phases allows the iterative re-use of multi-modality image features, which refines tumor segmentation results. We also find that our RFN produces consistent segmentation results across different network architectures.
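The recurrent fusion idea in the abstract (each phase fuses PET and CT information together with the previous phase's intermediary segmentation, then refines it) can be illustrated with a toy sketch. This is not the authors' network. It works on a 1D list of normalized voxel intensities and replaces learned convolutional features with fixed, hand-chosen weights. The names `fuse_phase` and `recurrent_fusion`, the weights, and the sample values are all hypothetical and serve only to show the feedback loop.

```python
import math


def sigmoid(x):
    """Logistic function mapping a score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse_phase(pet, ct, prev_seg, w_pet=4.0, w_ct=2.0, w_seg=3.0, bias=-4.5):
    """One toy fusion phase (hypothetical weights, not learned).

    Each voxel's new tumor probability combines its PET value, its CT value,
    and the previous phase's segmentation estimate for that voxel.
    """
    return [
        sigmoid(w_pet * p + w_ct * c + w_seg * s + bias)
        for p, c, s in zip(pet, ct, prev_seg)
    ]


def recurrent_fusion(pet, ct, phases=3):
    """Run several fusion phases, feeding each result into the next phase."""
    seg = [0.5] * len(pet)  # uninformative initial estimate
    history = []
    for _ in range(phases):
        seg = fuse_phase(pet, ct, seg)
        history.append(seg)
    return history


# Hypothetical normalized intensities for four voxels.
pet = [0.1, 0.9, 0.8, 0.2]  # high values: strong glucose uptake
ct = [0.2, 0.7, 0.6, 0.9]   # high values: dense tissue

history = recurrent_fusion(pet, ct, phases=3)
mask = [1 if p > 0.5 else 0 for p in history[-1]]
print(mask)  # [0, 1, 1, 0]
```

In this toy setting the two voxels with high PET uptake become more confidently labeled as tumor with each phase, while the voxel with high CT density but low PET uptake drifts further below the threshold. That mirrors, in a very simplified way, how intermediary results can steer later phases.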

Bilateral Cross-Modal Fusion Network for Multimodal Whole-Body Tumor Segmentation

A Spatial Squeeze and Multimodal Feature Fusion Attention Network for Multiple Tumor Segmentation from PET–CT Volumes

Modality-level cross-connection and attentional feature fusion based deep neural network for multi-modal brain tumor segmentation

MFCNet: A multi-modal fusion and calibration networks for 3D pancreas tumor segmentation on PET-CT images

Flexible Fusion Network for Multi-Modal Brain Tumor Segmentation

Feature fusion and latent feature learning guided brain tumor segmentation and missing modality recovery network

Brain Tumor Segmentation in Multimodal MRI Via Pixel-Level and Feature-Level Image Fusion

A multi-modality fusion network based on attention mechanism for brain tumor segmentation

Multi-modal Evidential Fusion Network for Trusted PET/CT Tumor Segmentation

Recurrent Feature Fusion Learning for Multi-Modality PET-CT Tumor Segmentation

CT-Less Whole-Body Bone Segmentation of PET Images Using a Multimodal Deep Learning Network

Learning intra-inter-modality complementary for brain tumor segmentation

Cross Modality Fusion for Modality-Specific Lung Tumor Segmentation in PET-CT Images

Multi-Modality Brain Tumor Segmentation Network Based on Collaborative Feature Fusion

CMAF-Net: a cross-modal attention fusion-based deep neural network for incomplete multi-modal brain tumor segmentation

MM-BiFPN: Multi-Modality Fusion Network With Bi-FPN for MRI Brain Tumor Segmentation

MFU-Net: a deep multimodal fusion network for breast cancer segmentation with dual-layer spectral detector CT

Multi-modal Brain Tumor Segmentation via Missing Modality Synthesis and Modality-level Attention Fusion

Effective Multipath Feature Extraction 3D CNN for Multimodal Brain Tumor Segmentation

Adaptive Cross-Feature Fusion Network with Inconsistency Guidance for Multi-Modal Brain Tumor Segmentation

Tumor co-segmentation in PET/CT using multi-modality fully convolutional neural network