PIS-Net: Efficient Medical Image Segmentation Network with Multivariate Downsampling for Point-of-Care

Changrui Zhang, Jia Wang
DOI: https://doi.org/10.3390/e26040284
IF: 2.738
2024-03-27
Entropy
Abstract: Recently, as portable diagnostic devices have become more widely available, point-of-care (PoC) imaging has become more convenient and more popular than traditional "bed imaging". Instant image segmentation, an important computer vision technology, is receiving increasing attention in PoC diagnosis. However, the image distortion caused by image preprocessing and the low resolution of medical images captured by PoC devices are urgent problems that need to be solved. Moreover, instant image segmentation requires a more efficient feature representation. In this paper, a new feature representation is proposed that considers the relationships among local features with minimal parameters and lower computational complexity. Since a feature window sliding along a diagonal can capture more diverse features, a Diagonal-Axial Multi-Layer Perceptron is designed to obtain the global correlation among local features for a more comprehensive feature representation. Additionally, a new multi-scale feature fusion is proposed to integrate nonlinear features with linear ones, yielding richer and more precise feature representations. To improve the generalization of the models, a dynamic residual spatial pyramid pooling based on various receptive fields is constructed according to different image sizes, which alleviates the influence of image distortion. The experimental results show that the proposed strategy performs better on instant image segmentation. Notably, it yields an average improvement of 1.31% in Dice over existing strategies on the BUSI, ISIC2018 and MoNuSeg datasets.
physics, multidisciplinary
What problem does this paper attempt to address?
The problems this paper attempts to solve center on several key challenges in real-time medical image segmentation:

1. **Image distortion caused by image pre-processing**: In the point-of-care (PoC) scenario, medical images generated by different portable devices vary in size, so they must go through pre-processing steps such as resizing, normalization, and data augmentation to unify their size. These steps inevitably distort the images, degrading edge clarity and the contrast between foreground and background, which reduces the effectiveness of semantic segmentation.
2. **Difficulty in feature extraction of low-resolution images**: Medical images captured by PoC devices are usually of low resolution, which makes it very difficult to extract complex and diverse features from them. Designing an efficient feature representation is therefore crucial for real-time image segmentation.
3. **Insufficient feature representation ability of existing models**: Existing hybrid models (such as MLP-CNN hybrid models) are too limited in how they represent features and struggle to learn sufficient feature representations, especially on complex images. These models are also limited in their ability to cope with image distortion.

To address these challenges, the authors propose a new medical image segmentation network (PIS-Net), which improves real-time image segmentation through the following innovations:

- **Diagonal-Axial MLP Block**: Parallel sliding windows along the diagonal direction enhance feature interaction between local windows. This models long-distance dependencies across the whole image while preserving good local feature extraction, yielding richer semantic features.
- **Hybrid Downsampling Strategy (HBDS)**: A hybrid downsampling branch is added in the MLP stage to fuse linear features with nonlinear ones, achieving multi-scale feature fusion and improving the generalization ability of the model.
- **Dynamic Residual Spatial Pyramid Pooling (DR-SPP)**: Adaptively selecting the pooling window size and adding parallel convolutional residual branches makes the pooling process more stable and reduces weight fluctuation, which alleviates the impact of image distortion on model performance.

With these innovations, PIS-Net performs better on real-time segmentation tasks across multiple medical image segmentation datasets, with an average improvement of 1.31% in the Dice coefficient.
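
For readers unfamiliar with the metric behind the reported 1.31% gain, the Dice coefficient of a predicted mask P and a ground-truth mask T is 2|P ∩ T| / (|P| + |T|). The following is a minimal sketch for binary masks stored as nested lists; the function name `dice_coefficient` and the smoothing term `eps` are assumptions made for illustration and are not taken from the paper.

```python
def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2 * |P & T| / (|P| + |T|) for two binary masks (nested lists of 0/1)."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)
    # eps (an assumed convention) avoids 0/0 when both masks are empty,
    # in which case the score evaluates to 1.0.
    return (2.0 * intersection + eps) / (total + eps)


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
score = dice_coefficient(pred, target)  # 2 shared pixels, 3 + 3 foreground pixels
```

Here two foreground pixels overlap and each mask has three foreground pixels, so the score is 4/6, roughly 0.667. A score of 1 means perfect overlap and 0 means none.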
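
The general idea of spatial pyramid pooling, on which DR-SPP builds, can be illustrated with a small sketch: a feature map of any size is pooled into fixed grids, so the output length does not depend on the input size and no distorting resize is required. This shows only classic pyramid pooling with fixed levels. It does not reproduce the adaptive window selection or the convolutional residual branches of DR-SPP, whose details are not given in this summary. The names `adaptive_max_pool` and `spatial_pyramid_pool` and the levels `(1, 2, 4)` are hypothetical.

```python
def adaptive_max_pool(feature_map, bins):
    """Max-pool a 2D map (list of rows) into a bins x bins grid, whatever its size."""
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(bins):
        # Adaptive window bounds: floor for the start, ceiling for the end,
        # so every window is non-empty and the whole map is covered.
        r0 = (i * h) // bins
        r1 = ((i + 1) * h + bins - 1) // bins
        for j in range(bins):
            c0 = (j * w) // bins
            c1 = ((j + 1) * w + bins - 1) // bins
            pooled.append(max(feature_map[r][c]
                              for r in range(r0, r1)
                              for c in range(c0, c1)))
    return pooled


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Concatenate pooled grids from every pyramid level into one fixed-length vector."""
    out = []
    for bins in levels:
        out.extend(adaptive_max_pool(feature_map, bins))
    return out


# Two maps of different sizes yield vectors of the same length (1 + 4 + 16 = 21).
small = [[r * 5 + c for c in range(5)] for r in range(4)]   # 4 x 5
large = [[r * 9 + c for c in range(9)] for r in range(7)]   # 7 x 9
vec_small = spatial_pyramid_pool(small)
vec_large = spatial_pyramid_pool(large)
```

Because the grid sizes are fixed while the window sizes scale with the input, images from devices with different resolutions can feed the same downstream layers.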