OPTICAL FLOW ESTIMATION USING CHANNEL ATTENTION MECHANISM
Xiang Xuezhi, Syed Masroor Ali, Ghulam Farid
DOI: https://doi.org/10.1615/jflowvisimageproc.2019031771
2019-01-01
Journal of Flow Visualization and Image Processing
Abstract: Computing optical flow from images remains a challenging task. Convolutional neural networks (CNNs), a deep learning method, are now widely applied to optical flow estimation. One common CNN framework is the U-Net architecture, an encoder-decoder design that can be trained end to end. However, its encoder follows the same design used in image classification networks, and its decoder upsamples the spatial feature maps to the full input resolution through successive deconvolutions. Because optical flow is a pixel-level task, this type of architecture yields blurred flow fields owing to coarse, low-resolution features. This article seeks to address that problem. To this end, two components, dilated convolution and a channel attention mechanism, are introduced into the FlowNetCorr network to estimate optical flow and reduce the training loss. In our framework, dilated convolution is used to improve spatial accuracy and to enlarge the receptive field without a large computational cost, while keeping the spatial resolution of the feature maps unchanged. Channel attention is based on the squeeze-and-excitation architecture developed for image classification, which adaptively recalibrates channel-wise features using global channel information. Comprehensive experiments are carried out on the MPI-Sintel (Clean and Final) and KITTI (2012 and 2015) test datasets to assess the performance of our framework. The experimental results indicate that, on the MPI-Sintel (Clean and Final) datasets, our framework outperforms many unsupervised methods (e.g., USCNN, UnSupFlownet, DSTFlow) and state-of-the-art supervised methods (e.g., SpyNet, SpyNet+ft, CaF-Full-41c) in terms of training loss, accuracy, and visual quality, although it performs worse than FlowNet2. On the KITTI (2012 and 2015) datasets, our framework also performs better than many methods, with the exception of UnFlow+ft. These results confirm the value of both dilated convolution and channel attention for optical flow estimation.
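
The two building blocks named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `dilated_conv2d` and `channel_attention`, the single-channel convolution, the random weights, and the reduction ratio of 4 are all assumptions chosen for illustration. The attention function follows the standard squeeze-and-excitation recipe (global average pooling, a bottleneck layer with ReLU, and a sigmoid gate per channel); the variant used inside FlowNetCorr may differ in its details.

```python
import numpy as np


def dilated_conv2d(x, kernel, dilation=1):
    """Stride-1 dilated cross-correlation of one 2D map with 'same' zero padding.

    Assumes an odd-sized kernel. The output has the same height and width as
    the input, while the kernel covers dilation * (k - 1) + 1 pixels per axis.
    """
    kh, kw = kernel.shape
    pad_h = dilation * (kh - 1) // 2
    pad_w = dilation * (kw - 1) // 2
    xp = np.pad(x, ((pad_h, pad_h), (pad_w, pad_w)))
    h, w = x.shape
    out = np.zeros((h, w), dtype=float)
    # Accumulate one shifted copy of the input per kernel tap.
    for i in range(kh):
        for j in range(kw):
            top, left = i * dilation, j * dilation
            out += kernel[i, j] * xp[top:top + h, left:left + w]
    return out


def channel_attention(features, w1, b1, w2, b2):
    """Squeeze-and-excitation style reweighting of a (C, H, W) feature tensor."""
    squeezed = features.mean(axis=(1, 2))               # squeeze: (C,)
    hidden = np.maximum(0.0, w1 @ squeezed + b1)        # bottleneck + ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))   # sigmoid gates in (0, 1)
    return features * gates[:, None, None], gates


# Hypothetical demo values: 8 channels, 16x16 maps, reduction ratio 4.
rng = np.random.default_rng(0)
channels, reduction = 8, 4
features = rng.standard_normal((channels, 16, 16))
w1 = rng.standard_normal((channels // reduction, channels))
b1 = np.zeros(channels // reduction)
w2 = rng.standard_normal((channels, channels // reduction))
b2 = np.zeros(channels)

reweighted, gates = channel_attention(features, w1, b1, w2, b2)
smoothed = dilated_conv2d(features[0], np.ones((3, 3)) / 9.0, dilation=2)
print(reweighted.shape, smoothed.shape)
```

With a 3x3 kernel and a dilation of 2, each output pixel draws on a 5x5 neighborhood while still using only nine weights, and the output keeps the input's height and width. The gate vector holds one scalar per channel, so the attention step rescales whole feature maps rather than individual pixels.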