Optical Flow Estimation Using Dual Self-Attention Pyramid Networks

Mingliang Zhai, Xuezhi Xiang, Rongfang Zhang, Ning Lv, Abdulmotaleb El Saddik
DOI: https://doi.org/10.1109/tcsvt.2019.2943140
IF: 5.859
2020-10-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Recently, optical flow estimation has benefited greatly from deep learning based techniques. Most approaches use an encoder-decoder architecture (U-Net) or a spatial pyramid network (SPN) to learn optical flow. Both U-Net and SPN can extract multi-scale features and predict optical flow directly. However, existing networks fail to exploit the global information among channel features and the inter-spatial relationships of features. In this paper, we propose a dual self-attention pyramid network, which adaptively integrates local features with their global dependencies, focusing on important features and suppressing unimportant ones. Specifically, we introduce two types of attention modules into SPN, which emphasize meaningful features along the channel and spatial axes. The channel attention adaptively re-weights channel-wise features by considering interdependencies among channels. The spatial attention uses global contextual information to emphasize or suppress features at different spatial locations. Both attention modules are embedded into each pyramid level, so that features are refined at every scale. We evaluate our method on MPI-Sintel and KITTI. The experimental results show that the dual self-attention module improves the representation power of the network and further increases the accuracy of optical flow estimation.
engineering, electrical & electronic
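
The channel and spatial attention described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the module internals, so the sketch assumes a generic self-attention formulation without learned projections, a residual scale `gamma`, and fusion of the two branches by summation. The function names and the toy pyramid shapes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(feat, gamma=1.0):
    # feat: (C, H, W). Re-weight each channel using its similarity
    # to every other channel (a C x C attention map).
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)               # (C, N)
    weights = softmax(flat @ flat.T, axis=-1)   # (C, C)
    out = weights @ flat                        # (C, N)
    return gamma * out.reshape(c, h, w) + feat  # residual connection (assumed)

def spatial_attention(feat, gamma=1.0):
    # feat: (C, H, W). Each location aggregates features from all
    # locations, weighted by pairwise similarity (an N x N attention map).
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)               # (C, N)
    weights = softmax(flat.T @ flat, axis=-1)   # (N, N), row i weights positions j
    out = flat @ weights.T                      # (C, N)
    return gamma * out.reshape(c, h, w) + feat  # residual connection (assumed)

def dual_attention(feat):
    # Fuse the two branches by summation (an assumption for this sketch).
    return channel_attention(feat) + spatial_attention(feat)

# Apply the module at every level of a toy three-level feature pyramid.
rng = np.random.default_rng(0)
pyramid = [rng.standard_normal((8, 16 // 2**lvl, 16 // 2**lvl)) for lvl in range(3)]
refined = [dual_attention(f) for f in pyramid]
```

Each refined level keeps the shape of its input, so a module of this kind can be placed between feature extraction and flow prediction at any scale without changing the rest of the pyramid.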