Scale-flow: Estimating 3D Motion from Video

Han Ling, Quansen Sun, Zhenwen Ren, Yazhou Liu, Hongyuan Wang, Zichen Wang
DOI: https://doi.org/10.1145/3503161.3547979
2022-01-01
Abstract: This paper addresses the problem of normalized scene flow (NSF): given a pair of RGB video frames, estimate the 3D motion, which consists of optical flow and motion-in-depth. NSF is a powerful tool for action prediction and autonomous robot navigation, and has the advantage of requiring only a monocular, uncalibrated camera. However, most existing methods directly regress motion-in-depth from two RGB frames or from optical flow, which yields results that are neither accurate nor robust. Our key insight is the scale matching scheme: establishing correlations between two frames that contain objects at different scales, in order to estimate dense and continuous motion-in-depth. Based on scale matching, we propose a unified framework, Scale-flow, which combines scale matching with optical flow estimation. This combination allows optical flow estimation to use dense and continuous scale information for the first time, so that moving foreground objects can be estimated more accurately. On KITTI, our monocular approach achieves the lowest error in the foreground scene flow task, even compared with multi-camera methods. Moreover, on the motion-in-depth estimation task, Scale-flow reduces the error by 34% compared with the best published method. Code will be available.
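To make the link between scale and motion-in-depth concrete, the following is a minimal sketch and not the paper's method. It assumes an ideal pinhole camera and a rigid, fronto-parallel object, and it takes motion-in-depth to be the depth ratio Z2/Z1 between the two frames. Under those assumptions the apparent image size is inversely proportional to depth, so the size ratio between frames determines the depth ratio without knowing the focal length or the object's true size. The function names and the numeric values are hypothetical and chosen only for illustration.

```python
def projected_size(focal_px: float, real_size_m: float, depth_m: float) -> float:
    """Apparent size in pixels of a fronto-parallel object under a pinhole model."""
    return focal_px * real_size_m / depth_m


def motion_in_depth_from_scale(scale_ratio: float) -> float:
    """Depth ratio Z2/Z1 implied by an apparent size ratio size2/size1.

    Assumption: pinhole camera, rigid fronto-parallel object, so
    size is proportional to 1/Z and therefore Z2/Z1 = size1/size2.
    """
    if scale_ratio <= 0:
        raise ValueError("scale_ratio must be positive")
    return 1.0 / scale_ratio


# Hypothetical scene: a 1.5 m wide object approaches from 20 m to 16 m.
focal_px = 720.0  # illustrative value; it cancels out in the ratio
size1 = projected_size(focal_px, 1.5, 20.0)
size2 = projected_size(focal_px, 1.5, 16.0)

scale_ratio = size2 / size1          # object looks larger in frame 2
tau = motion_in_depth_from_scale(scale_ratio)
print(f"scale ratio = {scale_ratio:.3f}, motion-in-depth = {tau:.3f}")
```

A depth ratio below 1 means the object moved toward the camera. Because the focal length and the real object size cancel, the estimate needs no calibration, which is consistent with the abstract's point that NSF needs only a monocular, uncalibrated camera.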