TCNet: tensor and covariance attention network for semantic segmentation

Haixia Xu, Yanbang Liu, Wei Wang, Wei Zhou, Fanxun Ding, Feng Han, Wei Peng
DOI: https://doi.org/10.1007/s00500-024-09638-7
IF: 3.732
2024-02-08
Soft Computing
Abstract: The non-local network provides a pioneering approach for capturing long-range dependencies by aggregating query-specific global context at each query location; however, it applies an identical weight to every channel of the feature maps and ignores the differences among feature channels. We design a novel tensor attention module (TAM), which integrates context information along the spatial and channel dimensions by introducing a learnable bias parameter tensor, so that the feature at each location of each channel can aggregate the features from all other locations. Motivated by SE-Net, we propose a novel second-order covariance attention module (SCAM) that enhances the feature correlation between different channel maps through second-order statistics and a local cross-channel interaction strategy. We take the encoder–decoder segmentation network DeepLabv3+ as the baseline and add the TAM and SCAM attention modules to its encoder, yielding TCNet for semantic segmentation. Experimental results on the PASCAL VOC 2012 and Cityscapes datasets show that the proposed network outperforms other state-of-the-art segmentation networks.
computer science, artificial intelligence, interdisciplinary applications
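The abstract describes SCAM only at a high level, so the following is a minimal sketch of the general idea (channel attention driven by second-order statistics, followed by a local cross-channel interaction), not the authors' implementation. The function name `covariance_channel_attention`, the row-mean summary of the covariance matrix, the zero-padded 1D convolution over channels, and the fixed `kernel` weights are all assumptions made for illustration; in a trained network the kernel would be learned.

```python
import math


def covariance_channel_attention(features, kernel):
    """Hypothetical sketch of covariance-based channel attention.

    features: list of C channels, each a flat list of N spatial values.
    kernel:   odd-length list of 1D weights (assumed; learned in practice).
    Returns (rescaled_features, channel_weights).
    """
    num_channels = len(features)
    num_positions = len(features[0])

    # Center each channel over its spatial positions.
    centered = []
    for channel in features:
        mean = sum(channel) / num_positions
        centered.append([v - mean for v in channel])

    # Second-order statistics: the C x C channel covariance matrix.
    cov = [
        [sum(a * b for a, b in zip(ci, cj)) / num_positions for cj in centered]
        for ci in centered
    ]

    # Summarize each channel by the mean of its covariance row (assumption).
    descriptor = [sum(row) / num_channels for row in cov]

    # Local cross-channel interaction: zero-padded 1D convolution over
    # neighboring channel descriptors, followed by a sigmoid gate.
    half = len(kernel) // 2
    weights = []
    for c in range(num_channels):
        acc = 0.0
        for offset, w in enumerate(kernel):
            idx = c + offset - half
            if 0 <= idx < num_channels:
                acc += w * descriptor[idx]
        weights.append(1.0 / (1.0 + math.exp(-acc)))

    # Rescale every channel by its attention weight.
    scaled = [[w * v for v in channel] for w, channel in zip(weights, features)]
    return scaled, weights


if __name__ == "__main__":
    toy = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [1.0, 1.0, 1.0, 1.0]]
    _, channel_weights = covariance_channel_attention(toy, [0.25, 0.5, 0.25])
    print(channel_weights)
```

In this toy input the third channel is constant, so its covariance with every channel is zero and its gate depends only on what the kernel mixes in from its neighbors. A wider kernel lets each channel's weight depend on more neighboring channels, at the cost of more parameters.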