CAT-Unet: An Enhanced U-Net Architecture with Coordinate Attention and Skip-Neighborhood Attention Transformer for Medical Image Segmentation
Zhiquan Ding, Yuejin Zhang, Chenxin Zhu, Guolong Zhang, Xiong Li, Nan Jiang, Yue Que, Yuanyuan Peng, Xiaohui Guan
DOI: https://doi.org/10.1016/j.ins.2024.120578
IF: 8.1
2024-04-11
Information Sciences
Abstract: With the rise of deep learning, the U-Net network, based on a U-shaped architecture and skip connections, has found widespread application in various medical image segmentation tasks. However, the receptive field of the standard convolution operation is limited, which makes it difficult to achieve global, long-distance semantic information interaction. Inspired by the advantages of ConvNext and Neighborhood Attention (NA), we propose CAT-Unet in this study to address this limitation. We reduce the number of parameters by using large kernels and depthwise separable convolutions. We also introduce a Coordinate Attention (CA) module, which enables the model to learn more comprehensive contextual information from surrounding regions. Furthermore, we introduce Skip-NAT (Neighborhood Attention Transformer) as the main algorithmic framework, replacing U-Net's original skip-connection layers, to lessen the impact of shallow features on network efficiency. Experimental results show that CAT-Unet achieves better segmentation results. On the ISIC2018 dataset, the best results for Dice (Dice coefficient), IoU (Intersection over Union), and HD (Hausdorff Distance) are 90.26%, 83.58%, and 4.259, respectively. On the PH2 dataset, the best Dice, IoU, and HD results are 96.49%, 91.81%, and 3.971, respectively. Finally, on the DSB2018 dataset, the best Dice, IoU, and HD results are 94.58%, 88.78%, and 3.749, respectively.
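For readers unfamiliar with the overlap metrics reported in the abstract, the following is a minimal sketch of how Dice and IoU are commonly computed for a pair of binary masks. It is not the authors' evaluation code: the function name `dice_and_iou`, the nested-list mask representation, and the convention of scoring two empty masks as a perfect match are all assumptions made for illustration.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Flatten both masks and coerce every pixel to 0 or 1.
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_target = [int(bool(v)) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    intersection = sum(p & t for p, t in zip(flat_pred, flat_target))
    pred_area = sum(flat_pred)
    target_area = sum(flat_target)
    union = pred_area + target_area - intersection

    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Toy 2x3 masks: 2 overlapping pixels, 3 foreground pixels in each mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
print(f"Dice = {dice:.4f}, IoU = {iou:.4f}")
```

For a single pair of masks the two scores are tied by Dice = 2·IoU / (1 + IoU). Scores averaged over a dataset, such as those quoted above, need not satisfy this identity exactly.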
computer science, information systems