MA-ResNet50: A General Encoder Network for Video Segmentation.

Xiaotian Liu, Lei Yang, Xiaoyu Zhang, Xiaohui Duan
DOI: https://doi.org/10.5220/0010800800003124
2022-01-01
Abstract: To improve the performance of segmentation networks on video streams, most researchers currently use either optical-flow-based methods or non-optical-flow CNN-based methods. The former suffer from heavy computational cost and high latency, while the latter suffer from poor applicability and versatility. In this paper, we design a Partial Channel Memory Attention (PCMA) module to store and fuse time-series features from video sequences. We then propose a Memory Attention ResNet50 network (MA-ResNet50) by combining the PCMA module with ResNet50, making it the first video-based feature extraction encoder applicable to most currently proposed segmentation networks. For experiments, we combine MA-ResNet50 with four established per-frame segmentation networks: DeeplabV3P, PSPNet, SFNet, and DNLNet. The results show that MA-ResNet50 generally outperforms the original ResNet50 in these four networks on VSPW and CamVid. Our method also achieves state-of-the-art accuracy on CamVid. The code is available at https://github.com/xiaotianliu01/MA-Resnet50.
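The abstract says that the PCMA module stores and fuses time-series features, but it does not describe the module's internals. The sketch below is only one possible reading of that idea and is not the authors' implementation: a fixed-size memory keeps features from previous frames, and a fraction of the current frame's channels is fused with that memory through dot-product attention. The class name `PartialChannelMemory`, the parameters `capacity` and `ratio`, the averaging step, and the use of a flat per-channel vector in place of a spatial feature map are all assumptions made for illustration.

```python
import math
from collections import deque


class PartialChannelMemory:
    """Hypothetical sketch, not the paper's PCMA module.

    Keeps the attended channels of the last `capacity` frames and fuses
    the first `ratio` fraction of the current frame's channels with them.
    """

    def __init__(self, capacity=3, ratio=0.5):
        self.memory = deque(maxlen=capacity)  # oldest entry is dropped automatically
        self.ratio = ratio

    def fuse(self, feat):
        # feat: list of floats, one value per channel (spatial dims omitted).
        k = int(len(feat) * self.ratio)
        query = feat[:k]
        if self.memory and k > 0:
            # Scaled dot-product score between the current frame and each stored frame.
            scores = [
                sum(q * m for q, m in zip(query, mem)) / math.sqrt(k)
                for mem in self.memory
            ]
            # Numerically stable softmax over the stored frames.
            top = max(scores)
            weights = [math.exp(s - top) for s in scores]
            total = sum(weights)
            weights = [w / total for w in weights]
            # Attention-weighted read from memory, averaged with the current features
            # (the averaging rule is an assumption).
            read = [
                sum(w * mem[i] for w, mem in zip(weights, self.memory))
                for i in range(k)
            ]
            fused = [(q + r) / 2 for q, r in zip(query, read)]
        else:
            fused = list(query)  # first frame: nothing to fuse with
        self.memory.append(list(query))  # store the unfused features for later frames
        return fused + feat[k:]  # remaining channels pass through unchanged


pcm = PartialChannelMemory(capacity=2, ratio=0.5)
out1 = pcm.fuse([1.0, 2.0, 3.0, 4.0])
out2 = pcm.fuse([3.0, 4.0, 5.0, 6.0])
```

In this sketch only the first half of the channels is touched by the memory, so the per-frame cost of the attention step scales with `ratio`; whether the actual module selects channels this way would have to be checked against the released code.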