ShapeMamba-EM: Fine-Tuning Foundation Model with Local Shape Descriptors and Mamba Blocks for 3D EM Image Segmentation

Ruohua Shi, Qiufan Pang, Lei Ma, Lingyu Duan, Tiejun Huang, Tingting Jiang
2024-08-26
Abstract: Electron microscopy (EM) imaging offers unparalleled resolution for analyzing neural tissues, crucial for uncovering the intricacies of synaptic connections and neural processes fundamental to understanding behavioral mechanisms. Recently, foundation models have demonstrated impressive performance across numerous natural and medical image segmentation tasks. However, applying these foundation models to EM segmentation faces significant challenges due to domain disparities. This paper presents ShapeMamba-EM, a specialized fine-tuning method for 3D EM segmentation, which employs adapters for long-range dependency modeling and an encoder for local shape description within the original foundation model. This approach effectively addresses the unique volumetric and morphological complexities of EM data. Tested over a wide range of EM images, covering five segmentation tasks and 10 datasets, ShapeMamba-EM outperforms existing methods, establishing a new standard in EM image segmentation and enhancing the understanding of neural tissue architecture.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the challenges specific to electron microscopy (EM) image segmentation. Although foundation models perform well on natural and medical image segmentation tasks, they face a significant domain gap when applied to 3D EM image segmentation. The main difficulties are:

1. **High Resolution and Noise**: EM images have a much higher resolution (about \(10^5\) times) than ordinary medical images, which results in more noise.
2. **Consistency of Local Morphological Features**: The segmented objects in EM images have relatively consistent local features and are widely distributed in the spatial domain.
3. **Inherent Anisotropy**: EM images are inherently anisotropic, which makes the segmentation task more complex.

To address these problems, the paper proposes ShapeMamba-EM, a specialized fine-tuning method built on a 3D medical foundation model, with two main improvements:

- **Introducing Adapters to Model Long-Range Dependencies**: 3D Mamba adapters enhance the capture of long-range dependencies.
- **Encoding Local Shape Descriptors (LSDs)**: LSDs are predicted through a 3D U-Net architecture to improve the accuracy of boundary prediction.

With these improvements, ShapeMamba-EM performs well across multiple EM image segmentation tasks and surpasses existing methods, setting a new standard for EM image segmentation and enhancing the understanding of neural tissue structures.

### Formula Summary

The main formulas involved in the paper are as follows:

1. **Definition of Local Shape Descriptors (LSDs)**:

   \[ S_v = \{ v' \in \Omega \mid y(v) = y(v'),\ \|v - v'\|_2^2 \leq \sigma \} \]

   \[ lsd_y(v) = \big( s(S_v),\ m(S_v) - v,\ c(S_v) \big) \]

   where:
   - \(s(S_v) = |S_v|\) is the size of the set \(S_v\)
   - \(m(S_v)\) is the mean of the coordinates in \(S_v\)
   - \(c(S_v)\) is the covariance of the coordinates in \(S_v\)

2. **Calculation of Mean and Covariance**:

   \[ m(S_v) = \frac{1}{s(S_v)} \sum_{v' \in S_v} v' \]

   \[ c(S_v) = \frac{1}{s(S_v)} \sum_{v' \in S_v} (v' - m(S_v))(v' - m(S_v))^T \]

These formulas define and calculate the local shape descriptors, which help the model better capture the local morphological features in the image.
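To make the formulas concrete, here is a minimal brute-force sketch that evaluates the three descriptor components for a single voxel, taking \(m\) as the coordinate mean and \(c\) as the coordinate covariance, as in the mean and covariance formulas. It is not the paper's implementation: the function name `local_shape_descriptor`, the nested-list `[z][y][x]` volume layout, and the exhaustive scan over all voxels are assumptions made for illustration only.

```python
from itertools import product


def local_shape_descriptor(labels, v, sigma):
    """Return (size, mean offset, covariance) for voxel v (hypothetical helper).

    labels: nested list indexed as labels[z][y][x], holding segment ids y(.)
    v:      (z, y, x) coordinate of the query voxel
    sigma:  threshold on the squared Euclidean distance, as in S_v
    """
    depth, height, width = len(labels), len(labels[0]), len(labels[0][0])
    vz, vy, vx = v
    target = labels[vz][vy][vx]

    # S_v: voxels with the same label as v and squared distance <= sigma.
    members = [
        (z, y, x)
        for z, y, x in product(range(depth), range(height), range(width))
        if labels[z][y][x] == target
        and (z - vz) ** 2 + (y - vy) ** 2 + (x - vx) ** 2 <= sigma
    ]

    # s(S_v): number of voxels in the neighbourhood (always >= 1, v itself).
    size = len(members)

    # m(S_v): mean coordinate; the descriptor stores the offset m(S_v) - v.
    mean = tuple(sum(p[i] for p in members) / size for i in range(3))
    offset = tuple(mean[i] - v[i] for i in range(3))

    # c(S_v): 3x3 covariance of the coordinates around the mean.
    cov = [
        [
            sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in members) / size
            for j in range(3)
        ]
        for i in range(3)
    ]
    return size, offset, cov


# A 3x3x3 volume with a single segment; query the centre voxel.
volume = [[[1] * 3 for _ in range(3)] for _ in range(3)]
size, offset, cov = local_shape_descriptor(volume, (1, 1, 1), sigma=1)
print(size, offset)  # 7 (0.0, 0.0, 0.0)
```

With `sigma=1` the neighbourhood of the centre voxel contains the voxel itself and its six face neighbours, so the mean offset is zero. Near a segment boundary the neighbourhood is cut off by the other segment, the offset becomes non-zero, and the covariance becomes asymmetric, which is the kind of local shape cue the descriptors are meant to encode.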