Abstract:The panorama stitching system is an indispensable module in surveillance or space exploration. Such a system enables the viewer to understand the surroundings instantly by aligning the surrounding images on a plane and fusing them naturally. The bottleneck of existing systems mainly lies in alignment and naturalness of the transition of adjacent images. When facing dynamic foregrounds, they may produce outputs with misaligned semantic objects, which is evident and sensitive to human perception. We solve three key issues in the existing workflow that can affect its efficiency and the quality of the obtained panoramic video and present Pedestrian360, a panoramic video system based on a structured camera array (a spatial surround-view camera system). First, to get a geometrically aligned 360○ view in the horizontal direction, we build a unified multi-camera coordinate system via a novel refinement approach that jointly optimizes camera poses. Second, to eliminate the brightness and color difference of images taken by different cameras, we design a photometric alignment approach by introducing a bias to the baseline linear adjustment model and solving it with two-step least-squares. Third, considering that the human visual system is more sensitive to high-level semantic objects, such as pedestrians and vehicles, we integrate the results of instance segmentation into the framework of dynamic programming in the seam-cutting step. To our knowledge, we are the first to introduce instance segmentation to the seam-cutting problem, which can ensure the integrity of the salient objects in a panorama. Specifically, in our surveillance oriented system, we choose the most significant target, pedestrians, as the seam avoidance target, and this accounts for the name Pedestrian360 . To validate the effectiveness and efficiency of Pedestrian360, a large-scale dataset composed of videos with pedestrians in five scenes is established. The test results on this dataset demonstrate the superiority of Pedestrian360 compared to its competitors. Experimental results show that Pedestrian360 can stitch videos at a speed of 12 to 26 fps, which depends on the number of objects in the shooting scene and their frequencies of movements. To make our reported results reproducible, the relevant code and collected data are publicly available at https://cslinzhang.github.io/Pedestrian360-Homepage/ .

Pano-SfMLearner: Self-Supervised Multi-Task Learning of Depth and Semantics in Panoramic Videos.

Omnisupervised Omnidirectional Semantic Segmentation

Can We PASS Beyond the Field of View? Panoramic Annular Semantic Segmentation for Real-World Surrounding Perception

Semantic Visual Odometry Based on Panoramic Annular Imaging

Self-supervised Learning of Depth and Camera Motion from 360 $$^\circ $$ Videos

Pedestrian-Aware Panoramic Video Stitching Based on a Structured Camera Array.

PanoDepth: A Two-Stage Approach for Monocular Omnidirectional Depth Estimation

Multi-Viewpoint Panorama Construction with Wide-Baseline Images

DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization

Unsupervised Learning of Depth and Ego-Motion from Cylindrical Panoramic Video with Applications for Virtual Reality

Elite360M: Efficient 360 Multi-task Learning via Bi-projection Fusion and Cross-task Collaboration

MODE: Multi-view Omnidirectional Depth Estimation with 360-degree Cameras

Self-supervised Learning of Monocular 3D Geometry Understanding with Two- and Three-View Geometric Constraints

Depth Anywhere: Enhancing 360 Monocular Depth Estimation via Perspective Distillation and Unlabeled Data Augmentation

3D Hierarchical Refinement and Augmentation for Unsupervised Learning of Depth and Pose From Monocular Video

Joint Self-Supervised Learning of Interest Point, Descriptor, Depth, and Ego-Motion from Monocular Video

Stereo Matching by Self-supervision of Multiscopic Vision.

Collaborative Learning of Depth Estimation, Visual Odometry and Camera Relocalization from Monocular Videos.

Lightweight Super-Resolution with Self-Calibrated Convolution for Panoramic Videos

See360: Novel Panoramic View Interpolation