Improving 3D Medical Image Segmentation at Boundary Regions using Local Self-attention and Global Volume Mixing

Daniya Najiha Abdul Kareem, Mustansar Fiaz, Noa Novershtern, Jacob Hanna, Hisham Cholakkal

2024-10-20

Abstract: Volumetric medical image segmentation is a fundamental problem in medical image analysis where the objective is to accurately classify a given 3D volumetric medical image with voxel-level precision. In this work, we propose a novel hierarchical encoder-decoder-based framework that strives to explicitly capture the local and global dependencies for volumetric 3D medical image segmentation. The proposed framework exploits local volume-based self-attention to encode the local dependencies at high resolution and introduces a novel volumetric MLP-mixer to capture the global dependencies in low-resolution feature representations. The proposed volumetric MLP-mixer learns better associations among volumetric feature representations. These explicit local and global feature representations contribute to better learning of the shape-boundary characteristics of the organs. Extensive experiments on three different datasets reveal that the proposed method achieves favorable performance compared to state-of-the-art approaches. On the challenging Synapse Multi-organ dataset, the proposed method achieves an absolute 3.82% gain over the state-of-the-art approaches in terms of the HD95 evaluation metric, while a similar improvement pattern is exhibited on the MSD Liver and Pancreas tumor datasets. We also provide a detailed comparison between recent architectural design choices in the 2D computer vision literature by adapting them for the problem of 3D medical image segmentation. Finally, our experiments on the ZebraFish 3D cell membrane dataset, which has limited training data, demonstrate the superior transfer learning capabilities of the proposed vMixer model on the challenging 3D cell instance segmentation task, where accurate boundary prediction plays a vital role in distinguishing individual cell instances.
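
For readers unfamiliar with the boundary metric named in the abstract: HD95 is the 95th-percentile Hausdorff distance between the predicted and reference boundaries, and lower values are better. The sketch below is illustrative only and is not taken from the paper. It assumes both boundaries are already given as arrays of 3D point coordinates, and it takes the larger of the two directed 95th percentiles; some toolkits instead pool both directions before taking the percentile, so values can differ slightly between implementations. The function names `directed_distances` and `hd95` are hypothetical.

```python
import numpy as np


def directed_distances(a, b):
    """For each point in a (n, 3), return the distance to its nearest point in b (m, 3)."""
    diff = a[:, None, :] - b[None, :, :]          # (n, m, 3) pairwise differences
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)


def hd95(a, b):
    """95th-percentile Hausdorff distance between two point sets (one common convention)."""
    d_ab = directed_distances(a, b)               # prediction -> reference
    d_ba = directed_distances(b, a)               # reference -> prediction
    return float(max(np.percentile(d_ab, 95), np.percentile(d_ba, 95)))


if __name__ == "__main__":
    # Two toy "boundaries": the second is the first shifted by 2 units along z.
    pred = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
    ref = pred + np.array([0.0, 0.0, 2.0])
    print(hd95(pred, ref))
```

The brute-force pairwise distance matrix is fine for small examples; real evaluations on full volumes usually extract surface voxels first and use distance transforms or spatial trees instead.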

Image and Video Processing, Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper addresses segmentation accuracy in 3D medical image segmentation, particularly at organ boundaries. Existing methods, both those based on Convolutional Neural Networks (CNNs) and those based on Transformers, have limitations in capturing long-range feature dependencies. The problem is especially pronounced in multi-organ segmentation, where the diversity of organ shapes and scales makes it difficult for these methods to capture complex tissue boundaries accurately. In addition, the cost of standard self-attention grows quadratically with the number of tokens, which limits its use on large 3D volumes. To address these issues, the paper proposes a new hierarchical encoder-decoder framework called vMixer. The framework explicitly captures local dependencies at high resolution through Local Volume Self-Attention (LVSA) blocks, and captures global dependencies in low-resolution feature representations through a novel Global Volume Mixer (GVM) block. This design helps the model learn the shape-boundary characteristics of organs, improving segmentation accuracy, particularly in boundary regions. The paper validates the method through experiments on multiple datasets, reporting better performance than existing methods.
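
The two ideas in this summary, attention restricted to local 3D windows and MLP-style mixing across all tokens of a low-resolution volume, can be sketched in a few lines of NumPy. This is a generic illustration under stated assumptions, not the paper's LVSA or GVM blocks: it uses non-overlapping cubic windows, a single attention head, ReLU in place of the usual GELU, and no normalization layers. All function names and weight shapes are hypothetical.

```python
import numpy as np


def partition_windows(x, w):
    """Split a (D, H, W, C) volume into non-overlapping w*w*w windows.

    Returns an array of shape (num_windows, w**3, C).
    """
    D, H, W, C = x.shape
    assert D % w == 0 and H % w == 0 and W % w == 0, "volume must tile evenly"
    x = x.reshape(D // w, w, H // w, w, W // w, w, C)
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)          # window indices first
    return x.reshape(-1, w ** 3, C)


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)       # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def local_window_attention(x, w, Wq, Wk, Wv):
    """Single-head self-attention computed independently inside each window."""
    win = partition_windows(x, w)                 # (n, t, C) with t = w**3
    q, k, v = win @ Wq, win @ Wk, win @ Wv
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])   # (n, t, t)
    return softmax(scores) @ v


def token_mixing_mlp(tokens, W1, W2):
    """MLP-Mixer-style token mixing: an MLP applied along the token axis.

    tokens: (N, C); W1: (N, hidden); W2: (hidden, N). Includes a residual.
    """
    h = np.maximum(tokens.T @ W1, 0.0)            # (C, hidden), ReLU for brevity
    return tokens + (h @ W2).T                    # back to (N, C)


def attention_pairs(shape, w):
    """Token pairs scored by global attention vs. windowed attention."""
    n = int(np.prod(shape))
    return n * n, n * w ** 3


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vol = rng.standard_normal((8, 8, 8, 4))
    Wq, Wk, Wv = (rng.standard_normal((4, 4)) for _ in range(3))
    print(local_window_attention(vol, 4, Wq, Wk, Wv).shape)
    print(attention_pairs((32, 32, 32), 4))
```

The `attention_pairs` helper makes the quadratic-cost point concrete: for a fixed window size, the windowed count grows linearly with the number of voxels, whereas the global count grows with its square. Token mixing has a weight matrix whose size depends on the token count, which is one reason such global mixing is more practical on low-resolution feature maps.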
