Abstract: Background and objective: [18F]-fluorodeoxyglucose (FDG) positron emission tomography – computed tomography (PET-CT) is now the preferred imaging modality for staging many cancers. PET images characterize tumoral glucose metabolism, while CT depicts the complementary anatomical localization of the tumor. Automatic tumor segmentation is an important step in image analysis in computer-aided diagnosis systems. Recently, fully convolutional networks (FCNs), with their ability to leverage annotated datasets and extract image feature representations, have become the state of the art in tumor segmentation. Few FCN-based methods support multi-modality images, and current methods have primarily focused on fusing multi-modality image features at various stages, i.e., early-fusion, where the multi-modality image features are fused prior to the FCN; late-fusion, where the resultant features are fused; and hyper-fusion, where multi-modality image features are fused across multiple image feature scales. Early- and late-fusion methods, however, have inherently limited freedom to fuse complementary multi-modality image features. The hyper-fusion methods learn different image features across different image feature scales, which can result in inaccurate segmentations, particularly where the tumors have heterogeneous textures. Methods: We propose a recurrent fusion network (RFN), which consists of multiple recurrent fusion phases that progressively fuse the complementary multi-modality image features with intermediary segmentation results derived at individual recurrent fusion phases: (1) the recurrent fusion phases iteratively learn the image features and then refine the subsequent segmentation results; and (2) the intermediary segmentation results allow our method to focus on learning the multi-modality image features around these intermediary segmentation results, which minimizes the risk of inconsistent feature learning.
Results: We evaluated our method on two pathologically proven non-small cell lung cancer PET-CT datasets. We compared our method to the commonly used fusion methods (early-fusion, late-fusion and hyper-fusion) and the state-of-the-art PET-CT tumor segmentation methods on various network backbones (ResNet, DenseNet and 3D-UNet). Our results show that the RFN provides more accurate segmentation than the existing methods and generalizes to different datasets. Conclusions: We show that learning through multiple recurrent fusion phases allows the iterative re-use of multi-modality image features, which refines tumor segmentation results. We also find that our RFN produces consistent segmentation results across different network architectures.
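
The fusion strategies named in the abstract can be contrasted with a toy sketch. This is only an illustration under loud assumptions, not the authors' network: `toy_net` is a hypothetical stand-in for an FCN (it just averages its input channels), images are flat lists of intensities in [0, 1], and the multiplicative re-weighting in `recurrent_fusion` is an assumed way of "focusing" on the intermediary segmentation.

```python
from typing import List

Image = List[float]  # toy "image": a flat list of pixel intensities in [0, 1]


def toy_net(channels: List[Image]) -> Image:
    """Hypothetical stand-in for an FCN: per-pixel mean over input channels."""
    n = len(channels)
    return [sum(pixels) / n for pixels in zip(*channels)]


def early_fusion(pet: Image, ct: Image) -> Image:
    # Early fusion: stack both modalities as channels before a single network.
    return toy_net([pet, ct])


def late_fusion(pet: Image, ct: Image) -> Image:
    # Late fusion: one network per modality, then fuse the resulting outputs.
    pet_out = toy_net([pet])
    ct_out = toy_net([ct])
    return [(p + c) / 2 for p, c in zip(pet_out, ct_out)]


def recurrent_fusion(pet: Image, ct: Image, phases: int = 3) -> List[Image]:
    # Recurrent fusion (assumed form): every phase re-uses both modalities,
    # weighted by the previous phase's intermediary segmentation, and emits
    # a refined segmentation that guides the next phase.
    seg = [1.0] * len(pet)  # start with no preference for any pixel
    history = []
    for _ in range(phases):
        focused_pet = [p * s for p, s in zip(pet, seg)]
        focused_ct = [c * s for c, s in zip(ct, seg)]
        seg = toy_net([focused_pet, focused_ct])
        history.append(seg)
    return history


pet = [0.0, 1.0, 1.0, 0.0]   # toy metabolic uptake
ct = [0.0, 1.0, 0.5, 0.5]    # toy anatomical signal
segmentations = recurrent_fusion(pet, ct, phases=3)
```

In this toy setting, pixels where both modalities agree on high signal stay near 1 across phases, while ambiguous pixels are progressively suppressed. Because `toy_net` is linear, early and late fusion coincide here; with a real nonlinear network they generally would not.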

Design and Validate a Dual-Modality Characteristic Information Fusion System Based on Probabilistic Graphical Models

Mutual Information-Based Graph Co-Attention Networks for Multimodal Prior-Guided Magnetic Resonance Imaging Segmentation

Multi-modal Evidential Fusion Network for Trusted PET/CT Tumor Segmentation

Bilateral Cross-Modal Fusion Network for Multimodal Whole-Body Tumor Segmentation

Joint segmentation of tumors in 3D PET-CT images with a network fusing multi-view and multi-modal information

Deep Learning for Variational Multimodality Tumor Segmentation in PET/CT

Cross Modality Fusion for Modality-Specific Lung Tumor Segmentation in PET-CT Images

Recurrent Feature Fusion Learning for Multi-Modality PET-CT Tumor Segmentation

Tumor co-segmentation in PET/CT using multi-modality fully convolutional neural network

A Spatial Squeeze and Multimodal Feature Fusion Attention Network for Multiple Tumor Segmentation from PET–CT Volumes

Dual-modality 3D brain PET-CT image segmentation based on probabilistic brain atlas and classification fusion

3D Lymphoma Segmentation on PET/CT Images Via Multi-Scale Information Fusion with Cross-Attention

Learning Feature Fusion Via an Interpretation Method for Tumor Segmentation on PET/CT

MFCNet: A multi-modal fusion and calibration networks for 3D pancreas tumor segmentation on PET-CT images

LGMSU-Net: Local Features, Global Features, and Multi-Scale Features Fused the U-Shaped Network for Brain Tumor Segmentation

Deep PET/CT fusion with Dempster-Shafer theory for lymphoma segmentation

Multi-Modality Brain Tumor Segmentation Network Based on Collaborative Feature Fusion

Flexible Fusion Network for Multi-Modal Brain Tumor Segmentation

Dual-modality Brain PET-CT Image Segmentation Based on Adaptive Use of Functional and Anatomical Information

High-Quality Fusion and Visualization for MR-PET Brain Tumor Images via Multi-Dimensional Features

MFU-Net: a deep multimodal fusion network for breast cancer segmentation with dual-layer spectral detector CT