Abstract: A panorama stitching system is an indispensable module in surveillance and space exploration. Such a system lets the viewer understand the surroundings at a glance by aligning the surrounding images on a plane and fusing them naturally. The bottleneck of existing systems lies mainly in the alignment of adjacent images and the naturalness of the transition between them. When facing dynamic foregrounds, they may produce outputs with misaligned semantic objects, an artifact that is conspicuous because human perception is sensitive to it. We address three key issues in the existing workflow that affect its efficiency and the quality of the resulting panoramic video, and present Pedestrian360, a panoramic video system based on a structured camera array (a spatial surround-view camera system). First, to obtain a geometrically aligned 360° view in the horizontal direction, we build a unified multi-camera coordinate system via a novel refinement approach that jointly optimizes camera poses. Second, to eliminate the brightness and color differences among images taken by different cameras, we design a photometric alignment approach that introduces a bias term into the baseline linear adjustment model and solves it with two-step least squares. Third, considering that the human visual system is more sensitive to high-level semantic objects, such as pedestrians and vehicles, we integrate the results of instance segmentation into the dynamic programming framework of the seam-cutting step. To our knowledge, we are the first to introduce instance segmentation to the seam-cutting problem, which can ensure the integrity of the salient objects in a panorama. Specifically, in our surveillance-oriented system, we choose the most significant target, pedestrians, as the seam-avoidance target, which accounts for the name Pedestrian360. To validate the effectiveness and efficiency of Pedestrian360, we establish a large-scale dataset composed of videos with pedestrians in five scenes. The test results on this dataset demonstrate the superiority of Pedestrian360 over its competitors. Experimental results show that Pedestrian360 can stitch videos at 12 to 26 fps, depending on the number of objects in the scene and how frequently they move. To make the reported results reproducible, the relevant code and collected data are publicly available (https://cslinzhang.github.io/Pedestrian360-Homepage/).
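
The abstract does not give the energy function used in the seam-cutting step, so the following is only a minimal sketch of the general idea: a dynamic-programming seam search over the overlap region in which pixels covered by a segmented instance receive a large extra cost. The function name `find_seam`, the `penalty` value, and the toy cost map are illustrative assumptions, not the authors' implementation.

```python
def find_seam(cost, instance_mask, penalty=1e6):
    """Return one column index per row for a minimum-cost vertical seam.

    cost          : 2D list of per-pixel difference costs in the overlap region
    instance_mask : 2D list, truthy where a segmented instance (e.g. a pedestrian) lies
    penalty       : hypothetical extra cost that discourages cutting through instances
    """
    h, w = len(cost), len(cost[0])
    # Assumption: the mask is folded into the energy as an additive penalty.
    energy = [[cost[y][x] + (penalty if instance_mask[y][x] else 0.0)
               for x in range(w)] for y in range(h)]

    acc = [row[:] for row in energy]    # accumulated cost of the best path to each pixel
    back = [[0] * w for _ in range(h)]  # column chosen in the previous row

    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            # The seam may move at most one column between consecutive rows.
            prev = min(range(lo, hi + 1), key=lambda c: acc[y - 1][c])
            back[y][x] = prev
            acc[y][x] += acc[y - 1][prev]

    # Backtrack from the cheapest pixel in the last row.
    seam = [0] * h
    seam[h - 1] = min(range(w), key=lambda c: acc[h - 1][c])
    for y in range(h - 1, 0, -1):
        seam[y - 1] = back[y][seam[y]]
    return seam


# Toy overlap region: column 3 is the cheapest place to cut,
# but a "pedestrian" occupies rows 1-4, columns 2-5.
cost = [[1.0] * 8 for _ in range(6)]
for row in cost:
    row[3] = 0.0
mask = [[1 if 1 <= y <= 4 and 2 <= x <= 5 else 0 for x in range(8)]
        for y in range(6)]
seam = find_seam(cost, mask)
print(seam)
```

With an all-zero mask the seam follows the cheap column straight through the object; with the mask applied, the seam is pushed around it, which is the behavior the abstract describes as preserving the integrity of salient objects.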

Towards Panoptic 3D Parsing for Single Image in the Wild

Aerial-PASS: Panoramic Annular Scene Segmentation in Drone Videos

Can We PASS Beyond the Field of View? Panoramic Annular Semantic Segmentation for Real-World Surrounding Perception

PASS: Panoramic Annular Semantic Segmentation

DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization

Panoptic 3D Scene Reconstruction From a Single RGB Image

PanoSSC: Exploring Monocular Panoptic 3D Scene Reconstruction for Autonomous Driving

PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation

PanopticNeRF-360: Panoramic 3D-to-2D Label Transfer in Urban Scenes

Panoramic Panoptic Segmentation: Insights Into Surrounding Parsing for Mobile Agents via Unsupervised Contrastive Learning

PanoRecon: Real-Time Panoptic 3D Reconstruction from Monocular Video

Pedestrian-Aware Panoramic Video Stitching Based on a Structured Camera Array

Panoptic Lifting for 3D Scene Understanding with Neural Fields

In-Place Panoptic Radiance Field Segmentation with Perceptual Prior for 3D Scene Understanding

Accurate and Efficient 3D Panoptic Mapping Using Diverse Information Modalities and Multidimensional Data Association

EPRecon: An Efficient Framework for Real-Time Panoptic 3D Reconstruction from Monocular Video

PanopticRecon: Leverage Open-vocabulary Instance Segmentation for Zero-shot Panoptic Reconstruction

BUOL: A Bottom-Up Framework with Occupancy-aware Lifting for Panoptic 3D Scene Reconstruction From A Single Image

Unified Perceptual Parsing for Scene Understanding

PanDepth: Joint Panoptic Segmentation and Depth Completion

Automatic 3D Indoor Scene Modeling from Single Panorama