Pyramid Structured Optical Flow Learning with Motion Cues

Ji Dai, Shiyuan Huang, Truong Nguyen
DOI: https://doi.org/10.1109/icip.2018.8451284
2018-10-01
Abstract: After the introduction of FlowNet and the large-scale synthetic dataset Flying Chairs, we witnessed rapid growth in deep-learning-based optical flow estimation algorithms. However, most of these algorithms rely on a very deep network to learn both large and small motions, which makes them less efficient. They also process each frame pair individually on video datasets such as MPI Sintel, without using temporally correlated information across frames. This paper presents a pyramid-structured network that estimates the optical flow from coarse to fine. We use a much shallower subnetwork at each pyramid level to predict an incremental flow, which contains relatively small motions, based on the prediction from the higher (coarser) level. For video datasets, the network uses motion cues from the estimations of previous frames for assistance. Evaluations show that the proposed network outperforms FlowNet on multiple benchmarks and has a slight edge over other networks with a similar pyramid structure. The shallow network design reduces the parameter count by 88% compared to FlowNet, allowing it to reach a prediction speed of almost 100 frames per second.
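The coarse-to-fine scheme described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the average-pooling pyramid, the nearest-neighbour warping step, and the `subnet(img1, warped_img2, flow)` interface are all assumptions made for illustration, and each per-level subnetwork is stood in for by a plain Python callable. The sketch also leaves out the motion cues from previous frames, since the abstract does not say how they are injected.

```python
import numpy as np

def downsample(img, factor):
    """Average-pool an (H, W, C) image by an integer factor (H, W divisible)."""
    h, w = img.shape[:2]
    return img.reshape(h // factor, factor, w // factor, factor, -1).mean(axis=(1, 3))

def upsample_flow(flow, factor=2):
    """Nearest-neighbour upsampling; flow vectors scale with the resolution."""
    return np.repeat(np.repeat(flow, factor, axis=0), factor, axis=1) * factor

def warp(img, flow):
    """Backward-warp img by flow (x in channel 0, y in channel 1), nearest sampling."""
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    x_src = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    y_src = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return img[y_src, x_src]

def coarse_to_fine_flow(img1, img2, subnets):
    """Estimate flow from coarse to fine.

    subnets is ordered coarsest to finest. Each entry is a hypothetical
    callable (img1_level, warped_img2_level, flow_so_far) -> incremental
    flow of shape (H_level, W_level, 2), standing in for a shallow network.
    """
    levels = len(subnets)
    flow = None
    for i, subnet in enumerate(subnets):
        factor = 2 ** (levels - 1 - i)
        a = downsample(img1, factor)
        b = downsample(img2, factor)
        if flow is None:
            flow = np.zeros(a.shape[:2] + (2,))  # start from zero motion
        else:
            flow = upsample_flow(flow)           # carry the coarser estimate down
        # The subnetwork only has to explain the small residual motion.
        flow = flow + subnet(a, warp(b, flow), flow)
    return flow

def zero_subnet(img1, warped_img2, flow):
    """Placeholder subnetwork that predicts no incremental motion."""
    return np.zeros_like(flow)

frame1 = np.random.rand(8, 8, 1)
frame2 = np.random.rand(8, 8, 1)
flow = coarse_to_fine_flow(frame1, frame2, [zero_subnet] * 3)
print(flow.shape)  # (8, 8, 2)
```

Because each level starts from the upsampled estimate of the level above, a subnetwork only needs to model the small remaining displacement, which is what makes a shallow design plausible at every level.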