UNet based on dynamic convolution decomposition and triplet attention

Yang Li,Bobo Yan,Jianxin Hou,Bingyang Bai,Xiaoyu Huang,Canfei Xu,Limei Fang
DOI: https://doi.org/10.1038/s41598-023-50989-2
IF: 4.6
2024-01-02
Scientific Reports
Abstract: The robustness and generalization of medical image segmentation models are challenged by differences between disease types, image types, and individual cases. Deep learning based semantic segmentation methods have provided state-of-the-art performance in the last few years. One deep learning technique, U-Net, has become the most popular architecture in medical image segmentation. Despite its outstanding overall performance in segmenting medical images, it still suffers from limited feature expression ability and inaccurate segmentation. To this end, we propose DTA-UNet, based on Dynamic Convolution Decomposition (DCD) and Triple Attention (TA). Firstly, the model, with Attention U-Net as the baseline network, uses DCD to replace all conventional convolutions in the encoding-decoding process to enhance its feature extraction capability. Secondly, we combine TA with the Attention Gate (AG) in the skip connections in order to highlight lesion regions by removing redundant information in both the spatial and channel dimensions. The proposed model is tested on two public datasets and one clinical dataset: the public COVID-SemiSeg dataset, the ISIC 2018 dataset, and a stroke segmentation dataset from a cooperating hospital. Ablation experiments on the clinical stroke segmentation dataset show the effectiveness of DCD and TA with only a 0.7628 M increase in the number of parameters compared to the baseline model. The proposed DTA-UNet is further evaluated on the three datasets, which contain different types of images, to verify its universality. Extensive experimental results show superior performance on different segmentation metrics compared to eight state-of-the-art methods. The GitHub URL of our code is https://github.com/shuaihou1234/DTA-UNet .
multidisciplinary sciences
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper aims to address the challenges to robustness and generalization in medical image segmentation models caused by variation across disease types, image types, and individual cases. Despite the excellent performance of deep learning-based semantic segmentation methods in recent years, and in particular the popularity of the U-Net architecture in medical image segmentation, these models still suffer from limited feature representation capability and inaccurate segmentation.

To tackle these problems, the authors propose an improved U-Net model based on Dynamic Convolution Decomposition (DCD) and the Triple Attention Mechanism (TA), called DTA-UNet. The main contributions of this model are:

1. **Introduction of DCD**: Replacing all conventional convolutions with DCD significantly enhances the feature extraction capability of CNNs while adding only a few parameters.
2. **Combination of TA and Attention Gate (AG)**: Using TA and AG in skip connections to highlight lesion areas and optimize the features extracted by dynamic convolution.
3. **Universal Segmentation Model**: DTA-UNet is a universal segmentation model that is friendly to lesion areas and boundaries, showing accurate segmentation results on three different types of datasets.

With these improvements, DTA-UNet performs better when dealing with lesions that differ significantly in shape, size, or location, and is more sensitive to boundary blurring issues.
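
To make the DCD idea more concrete, the following is a minimal sketch in plain Python of how a dynamic kernel might be assembled for a 1x1 convolution acting on one feature vector. It assumes the formulation commonly given for dynamic convolution decomposition, W(x) = Λ(x) W0 + P Φ(x) Qᵀ, where W0 is a static kernel, Λ(x) is a diagonal channel-attention matrix, and Φ(x) is a small L x L fusion matrix in a latent space reached through Q and P. The summary above does not spell out this formula, so treat it as an assumption. All function names here are hypothetical, and `toy_dynamic_terms` is only a placeholder for the learned attention branch; none of this is taken from the authors' repository.

```python
import math
import random


def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m)."""
    b_cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def transpose(m):
    """Return the transpose of matrix m."""
    return [list(col) for col in zip(*m)]


def dcd_weight(w0, p, q, lam, phi):
    """Assemble the full dynamic kernel W(x) = diag(lam) W0 + P Phi Q^T."""
    scaled = [[g * w for w in row] for g, row in zip(lam, w0)]
    residual = matmul(matmul(p, phi), transpose(q))
    return [[s + r for s, r in zip(s_row, r_row)]
            for s_row, r_row in zip(scaled, residual)]


def dcd_apply(x, w0, p, q, lam, phi):
    """Apply W(x) to x without materialising the C x C dynamic kernel."""
    static = [g * s for g, s in zip(lam, matvec(w0, x))]  # diag(lam) W0 x
    latent = matvec(phi, matvec(transpose(q), x))         # Phi Q^T x, in L dims
    residual = matvec(p, latent)                          # expand back to C dims
    return [s + r for s, r in zip(static, residual)]


def toy_dynamic_terms(context, latent_dim):
    """Hypothetical placeholder for the learned branch that predicts lam and phi.

    A real model would use small trainable layers on pooled features; the
    fixed functions below only make the terms depend on the input.
    """
    lam = [2.0 / (1.0 + math.exp(-c)) for c in context]  # per-channel scale in (0, 2)
    n = len(context)
    phi = [[math.tanh(context[(i + j) % n]) for j in range(latent_dim)]
           for i in range(latent_dim)]
    return lam, phi


# Toy usage: C = 4 channels, latent dimension L = 2.
rng = random.Random(0)
C, L = 4, 2
w0 = [[rng.uniform(-1, 1) for _ in range(C)] for _ in range(C)]
p = [[rng.uniform(-1, 1) for _ in range(L)] for _ in range(C)]
q = [[rng.uniform(-1, 1) for _ in range(L)] for _ in range(C)]
x = [rng.uniform(-1, 1) for _ in range(C)]
lam, phi = toy_dynamic_terms(x, L)
y = dcd_apply(x, w0, p, q, lam, phi)
print(y)
```

Under this formulation, the input-dependent part is limited to the C entries of Λ(x) and the L x L entries of Φ(x), which is one plausible reading of why DCD adds only a few parameters. With Λ(x) set to the identity and Φ(x) set to zero, the layer reduces to the static convolution W0.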