Abstract: We present a novel Structure from Motion pipeline that is capable of reconstructing accurate camera poses for panorama-style video capture without prior camera intrinsic calibration. While panorama-style capture is common and convenient, previous reconstruction methods fail to obtain accurate reconstructions due to the rotation-dominant motion and small baseline between views. Our method is built on the assumption that the camera motion approximately corresponds to motion on a sphere, and we introduce three novel relative pose methods to estimate the fundamental matrix and camera distortion for spherical motion. These solvers are efficient and robust, and provide an excellent initialization for bundle adjustment. A soft prior on the camera poses is used to discourage large deviations from the spherical motion assumption when performing bundle adjustment, which allows cameras to remain properly constrained for optimization in the absence of well-triangulated 3D points. To validate the effectiveness of the proposed method we evaluate our approach on both synthetic and real-world data, and demonstrate that camera poses are accurate enough for multiview stereo.

What problem does this paper attempt to address?

The problem this paper attempts to solve is how to reconstruct accurate camera poses for panorama-style video capture without pre-calibrating the camera intrinsics. Specifically, the paper addresses the following problems:

1. **Rotation-dominated motion and small baselines between views**: Traditional Structure from Motion (SfM) and Simultaneous Localization and Mapping (SLAM) methods handle panorama-style videos poorly because the capture motion is mainly rotational. This results in limited overlap between images and small baselines between views, which makes the triangulation of 3D points unstable.
2. **Limitations of mobile phone cameras**: Mobile phone cameras have a limited field of view and suffer from motion blur and rolling shutter, which make reconstruction harder still.
3. **Lack of online feedback**: Without online feedback, untrained users can hardly capture the scene constraints required for high-quality SfM reconstruction.

### Main contributions of the paper

To solve the above problems, the paper proposes a novel SfM pipeline that efficiently reconstructs the camera poses and scene geometry from panorama-style videos shot with handheld devices. The main contributions include:

- **Spherical motion assumption**: The camera motion is modeled as approximately motion on the surface of a sphere, which reduces the degrees of freedom and thereby simplifies relative pose estimation.
- **Three new relative pose estimation methods**: These solvers estimate the fundamental matrix and the camera distortion under spherical motion.
- **Soft prior constraints**: A soft prior in bundle adjustment encourages the camera poses to stay close to the spherical motion assumption and prevents large deviations.
- **No need to pre-calibrate the camera**: The system works without pre-calibrated camera intrinsics, which makes it more versatile and practical.

### Experimental verification

To verify the effectiveness of the proposed method, the authors conducted experiments on synthetic and real-world data and compared the results with existing methods. The experimental results show that the method is more efficient and accurate on panorama-style videos, especially under rotation-dominated motion and small baselines between views.

### Summary of mathematical formulas

- **Camera projection matrix**: \[ P = K[R|-z] \] where \(K\) is the camera intrinsic matrix, \(R\) is the rotation matrix, and \(z = [0, 0, 1]^\top\).
- **Camera intrinsic matrix**: \[ K=\begin{bmatrix} f&0&0\\ 0&f&0\\ 0&0&1 \end{bmatrix} \]
- **Essential matrix**: \[ E = \begin{bmatrix} e_1&e_2&e_3\\ e_2&-e_1&e_4\\ e_5&e_6&0 \end{bmatrix} \]
- **Relationship between the fundamental matrix and the essential matrix**: \[ E = K^\top FK \]
- **Radial distortion model**: \[ p_u(\lambda)=\begin{bmatrix} x_d\\ y_d\\ 1+\lambda(x_d^2 + y_d^2) \end{bmatrix} \]

With these models and solvers, the paper addresses the key problems in panorama-style video reconstruction and reports more accurate and efficient reconstruction than the existing methods it is compared against.
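
The formulas summarized above can be checked numerically. The following is a minimal sketch, assuming NumPy, and is not the paper's implementation. All function names are hypothetical. The relative pose used here, \(R = R_2 R_1^\top\) and \(t = Rz - z\), is derived from the projection matrix \(P = K[R|-z]\) rather than quoted from the paper, and the pixel coordinates are assumed to be centered on the principal point, since \(K\) has no offset.

```python
import numpy as np

Z = np.array([0.0, 0.0, 1.0])  # z = [0, 0, 1]^T from the projection model


def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])


def rot(axis, angle):
    """Rotation matrix about `axis` by `angle` radians (Rodrigues formula)."""
    axis = np.asarray(axis, dtype=float)
    kx = skew(axis / np.linalg.norm(axis))
    return np.eye(3) + np.sin(angle) * kx + (1.0 - np.cos(angle)) * (kx @ kx)


def intrinsics(f):
    """K = diag(f, f, 1), as in the summary."""
    return np.diag([f, f, 1.0])


def project(K, R, X):
    """Homogeneous image point of the 3D point X under P = K [R | -z]."""
    P = K @ np.hstack([R, -Z.reshape(3, 1)])
    return P @ np.append(X, 1.0)


def spherical_essential(R1, R2):
    """Essential matrix between two cameras [R1 | -z] and [R2 | -z].

    Assumed derivation: X2 = R X1 + t with R = R2 R1^T and t = R z - z,
    hence E = [t]_x R.
    """
    R = R2 @ R1.T
    t = R @ Z - Z
    return skew(t) @ R


def fundamental_from_essential(E, K):
    """Invert E = K^T F K to obtain F = K^{-T} E K^{-1}."""
    K_inv = np.linalg.inv(K)
    return K_inv.T @ E @ K_inv


def undistort_division(xd, yd, lam):
    """Homogeneous undistorted point p_u(lambda) for a distorted point (xd, yd)."""
    return np.array([xd, yd, 1.0 + lam * (xd ** 2 + yd ** 2)])
```

Building `E` from two arbitrary rotations should reproduce the pattern in the essential matrix above: `E[2, 2]` is zero, `E[1, 1]` equals `-E[0, 0]`, and `E[0, 1]` equals `E[1, 0]`, which leaves six free entries. Projecting one 3D point into both cameras with `project` and evaluating `x2 @ F @ x1` should give a value near zero, up to floating-point error.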

Structure from Motion for Panorama-Style Videos
