LogTrans: Providing Efficient Local-Global Fusion with Transformer and CNN Parallel Network for Biomedical Image Segmentation

Zhiqiang Li, Xingqing Nie, T. Tong, Xiaogen Zhou, Luoyan Wang, Xing Lin
DOI: https://doi.org/10.1109/HPCC-DSS-SmartCity-DependSys57074.2022.00128
2022-12-01
Abstract: Accurate biomedical image segmentation is a prerequisite for effective computer-aided diagnosis (CAD) systems. Numerous studies have shown that convolutional neural networks (CNNs) have made impressive progress in segmentation tasks. Nevertheless, owing to their limited receptive field, CNN-based algorithms focus too heavily on local features at the expense of global context. The Transformer architecture, in contrast, can encode global dependencies through the self-attention mechanism, but this mechanism typically ignores the local pixel-level structural information within each image patch. A better way to integrate the CNN and Transformer architectures efficiently is therefore still needed. In this paper, we propose a novel parallel segmentation algorithm called LogTrans. First, in the encoder path, the local details and the global contour dependencies of the entire image are captured by the CNN branch and the Transformer branch, respectively. These two branches then complement each other through a novel separate-combiner (SeCo) module, leading to better fused features. Moreover, we attempt to further enhance segmentation performance by using a residual stackable dilated (ReSD) block, which applies residual shortcut connections to handle dimension changes in the target region and stacks dilated convolutions to capture more spatial information. The proposed LogTrans framework was evaluated on two biomedical datasets, ISIC-2017 and UITNS-2022. Overall, the results indicate that LogTrans outperforms other state-of-the-art architectures in both visual comparison and quantitative evaluation.
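The abstract's point about stacking dilated convolutions behind a residual shortcut can be illustrated with a small, dependency-free sketch. This is not the authors' ReSD block: the abstract gives no kernel sizes, dilation rates, or channel layout, so the 3-tap kernel, the dilation rates (1, 2, 4), the 1D signal, and all function names below are assumptions chosen purely for illustration. The sketch shows only the general fact that stacked stride-1 dilated convolutions widen the receptive field with no extra weights per layer, while the shortcut adds the input back to the output.

```python
from typing import List, Sequence


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field along one axis of stacked stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def dilated_conv1d(x: List[float], weights: Sequence[float], dilation: int) -> List[float]:
    """Same-length 1D dilated convolution with zero padding (odd kernel length assumed)."""
    half = (len(weights) // 2) * dilation
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = i - half + j * dilation
            if 0 <= idx < len(x):  # positions outside the signal count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def residual_dilated_stack(
    x: List[float], weights: Sequence[float], dilations: Sequence[int]
) -> List[float]:
    """Apply the dilated convolutions in sequence, then add the input back (shortcut)."""
    y = list(x)
    for d in dilations:
        y = dilated_conv1d(y, weights, d)
    return [a + b for a, b in zip(x, y)]


# Hypothetical settings: 3-tap kernels, dilation rates assumed for illustration.
print(receptive_field(3, [1, 1, 1]))  # 7  (three ordinary 3-tap convolutions)
print(receptive_field(3, [1, 2, 4]))  # 15 (same weight count, dilated)

# A unit impulse makes the footprint of the stack visible.
impulse = [0.0] * 21
impulse[10] = 1.0
response = residual_dilated_stack(impulse, [1.0, 1.0, 1.0], [1, 2, 4])
print(sum(1 for v in response if v != 0.0))  # 15 positions are influenced
```

With the same three 3-tap layers, the dilated stack covers 15 positions instead of 7, which is the sense in which dilation captures more spatial context. A real 2D block would also need learned weights, nonlinearities, and a projection on the shortcut whenever the input and output dimensions differ.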
Medicine, Computer Science