Pano-SfMLearner: Self-Supervised Multi-Task Learning of Depth and Semantics in Panoramic Videos.

Mengyi Liu, Shuhui Wang, Yulan Guo, Yuan He, Hui Xue
DOI: https://doi.org/10.1109/lsp.2021.3073627
2021-01-01
IEEE Signal Processing Letters
Abstract: With the advent of virtual reality and augmented reality applications, omnidirectional imaging and $360^{\circ}$ cameras have become increasingly popular in many scenarios such as entertainment and autonomous systems. In this paper, we propose a self-supervised framework for multi-task learning of depth, camera motion and semantics from panoramic videos. Specifically, our method is based on differentiable warping of adjacent views to the target view. Two improvements are provided. First, we introduce a view synthesis module based on equirectangular projection to enable direct optimization on panoramic images. Second, we introduce a self-supervised segmentation branch that imposes a semantic consistency constraint for further improvement. Extensive experiments on two $360^{\circ}$ video datasets and two $360^{\circ}$ image datasets demonstrate that our method outperforms the state of the art and achieves favorable cross-modality performance.
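The abstract's view synthesis on equirectangular images can be illustrated with a minimal sketch of the coordinate warp alone. This is not the authors' implementation. The function names (`pixel_to_ray`, `point_to_pixel`, `warp_coordinates`), the axis convention (y up, z forward, longitude increasing with the column index) and the half-pixel offsets are all assumptions made for illustration. The bilinear sampling of the source image and the photometric loss that a full pipeline would need are omitted.

```python
import numpy as np


def pixel_to_ray(u, v, width, height):
    """Map equirectangular pixel coordinates to unit directions on the sphere.

    Assumed convention: longitude in (-pi, pi) grows with u,
    latitude in (-pi/2, pi/2) shrinks with v, y is up, z is forward.
    """
    lon = (u + 0.5) / width * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (v + 0.5) / height * np.pi
    return np.stack(
        [np.cos(lat) * np.sin(lon), np.sin(lat), np.cos(lat) * np.cos(lon)],
        axis=-1,
    )


def point_to_pixel(points, width, height):
    """Project 3D points back to equirectangular pixel coordinates."""
    x, y, z = points[..., 0], points[..., 1], points[..., 2]
    r = np.linalg.norm(points, axis=-1)
    lon = np.arctan2(x, z)
    lat = np.arcsin(np.clip(y / r, -1.0, 1.0))
    u = (lon + np.pi) / (2.0 * np.pi) * width - 0.5
    v = (np.pi / 2.0 - lat) / np.pi * height - 0.5
    return u, v


def warp_coordinates(depth, rotation, translation):
    """For each target pixel, return the source-view coordinates to sample.

    depth:       (H, W) positive radial distances in the target view.
    rotation:    (3, 3) rotation from target to source camera frame.
    translation: (3,) translation from target to source camera frame.
    """
    height, width = depth.shape
    v, u = np.meshgrid(
        np.arange(height, dtype=np.float64),
        np.arange(width, dtype=np.float64),
        indexing="ij",
    )
    # Back-project: scale each unit ray by its radial depth.
    points = pixel_to_ray(u, v, width, height) * depth[..., None]
    # Rigid motion into the source camera frame.
    moved = points @ rotation.T + translation
    # Re-project onto the source panorama.
    return point_to_pixel(moved, width, height)
```

Under this assumed convention, a pure rotation about the vertical axis shifts every pixel horizontally by the same amount and wraps around at the image border, which is the property that lets a warp of this kind operate directly on the panorama instead of on perspective crops.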