Structure from Motion for Panorama-Style Videos

Chris Sweeney, Aleksander Holynski, Brian Curless, Steve M Seitz
DOI: https://doi.org/10.48550/arXiv.1906.03539
2019-06-09
Abstract: We present a novel Structure from Motion pipeline that is capable of reconstructing accurate camera poses for panorama-style video capture without prior camera intrinsic calibration. While panorama-style capture is common and convenient, previous reconstruction methods fail to obtain accurate reconstructions due to the rotation-dominant motion and small baseline between views. Our method is built on the assumption that the camera motion approximately corresponds to motion on a sphere, and we introduce three novel relative pose methods to estimate the fundamental matrix and camera distortion for spherical motion. These solvers are efficient and robust, and provide an excellent initialization for bundle adjustment. A soft prior on the camera poses is used to discourage large deviations from the spherical motion assumption when performing bundle adjustment, which allows cameras to remain properly constrained for optimization in the absence of well-triangulated 3D points. To validate the effectiveness of the proposed method we evaluate our approach on both synthetic and real-world data, and demonstrate that camera poses are accurate enough for multiview stereo.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses how to recover accurate camera poses from panorama-style video capture without calibrating the camera intrinsics beforehand. Specifically, it targets the following difficulties:

1. **Rotation-dominant motion and small baselines between views**: Traditional Structure from Motion (SfM) and Simultaneous Localization and Mapping (SLAM) methods perform poorly on panorama-style videos. The capture is mostly rotational, so the overlap between images is limited and the baselines between views are small, which makes the triangulation of 3D points unstable.
2. **Limitations of mobile phone cameras**: Mobile phone cameras have a limited field of view and suffer from motion blur and rolling shutter, which make reconstruction even harder.
3. **Lack of online feedback**: Without online feedback, untrained users can hardly capture the scene constraints required for a high-quality SfM reconstruction.

### Main contributions of the paper

To solve the above problems, the paper proposes a novel SfM pipeline that efficiently reconstructs camera poses and scene geometry from panorama-style videos shot with handheld devices. The main contributions include:

- **Spherical motion assumption**: Modeling the camera motion as motion on the surface of a sphere reduces the degrees of freedom and thereby simplifies relative pose estimation.
- **Three new relative pose estimation methods**: These solvers estimate the fundamental matrix and the camera distortion under spherical motion.
- **Soft prior constraints**: A soft prior in bundle adjustment keeps the camera poses close to the spherical motion assumption and prevents large deviations.
- **No need to pre-calibrate the camera**: The system works without prior calibration of the camera intrinsics, which makes it more widely applicable and practical.

### Experimental verification

To verify the effectiveness of the proposed method, the authors ran experiments on synthetic and real-world data and compared the results with existing methods. The results show that the method is more efficient and more accurate on panorama-style videos, especially under rotation-dominant motion and small baselines between views.

### Summary of mathematical formulas

- **Camera projection matrix**: \[ P = K[R|-z] \] where \(K\) is the camera intrinsic matrix, \(R\) is the rotation matrix, and \(z = [0, 0, 1]^\top\).
- **Camera intrinsic matrix**: \[ K=\begin{bmatrix} f&0&0\\ 0&f&0\\ 0&0&1 \end{bmatrix} \] where \(f\) is the focal length.
- **Essential matrix** (its form under spherical motion): \[ E = \begin{bmatrix} e_1&e_2&e_3\\ e_2&-e_1&e_4\\ e_5&e_6&0 \end{bmatrix} \]
- **Relationship between the fundamental matrix and the essential matrix**: \[ E = K^\top FK \]
- **Radial distortion model**: \[ p_u(\lambda)=\begin{bmatrix} x_d\\ y_d\\ 1+\lambda(x_d^2 + y_d^2) \end{bmatrix} \] where \((x_d, y_d)\) is the distorted image point and \(\lambda\) is the distortion parameter.

Together, these methods and models address the key difficulties of panorama-style video reconstruction and yield a more accurate and efficient pipeline.
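The formulas above can be checked numerically with a small sketch. It is an illustration only, not the authors' solvers, which estimate the fundamental matrix and distortion from image correspondences. The code uses plain Python to stay dependency-free. All function names, the rotation angles, the focal length `f = 800.0`, and the test point are hypothetical choices. It builds two cameras of the form \([R|-z]\), forms their essential matrix as \([t]_\times R\), and checks the structure \(e_{11} = -e_{22}\), \(e_{12} = e_{21}\), \(e_{33} = 0\).

```python
import math

def matmul(A, B):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def skew(t):
    """Cross-product matrix [t]_x."""
    return [[0, -t[2], t[1]], [t[2], 0, -t[0]], [-t[1], t[0], 0]]

def spherical_essential(R1, R2):
    """Essential matrix between the cameras [R1 | -z] and [R2 | -z]."""
    R = matmul(R2, transpose(R1))  # relative rotation
    # Relative translation t = t2 - R t1 with t1 = t2 = -z, so t = R z - z.
    t = [R[0][2], R[1][2], R[2][2] - 1.0]
    return matmul(skew(t), R)

def fundamental_from_essential(E, f):
    """F = K^{-T} E K^{-1} for K = diag(f, f, 1), so that E = K^T F K."""
    k_inv = [1.0 / f, 1.0 / f, 1.0]
    return [[k_inv[i] * E[i][j] * k_inv[j] for j in range(3)] for i in range(3)]

def undistort(x_d, y_d, lam):
    """Homogeneous undistorted point p_u(lambda) of the one-parameter model."""
    return [x_d, y_d, 1.0 + lam * (x_d ** 2 + y_d ** 2)]

def to_camera(R, X):
    """Camera-frame coordinates R X - z of a world point X."""
    return [sum(R[i][k] * X[k] for k in range(3)) - (1.0 if i == 2 else 0.0)
            for i in range(3)]

# Two hypothetical camera orientations on the unit sphere.
R1 = rot_y(0.10)
R2 = matmul(rot_x(-0.05), rot_y(0.35))
E = spherical_essential(R1, R2)
F = fundamental_from_essential(E, f=800.0)

# Epipolar residual x2^T E x1 for one hypothetical 3D point (should be ~0).
X = [0.4, -0.3, 5.0]
x1, x2 = to_camera(R1, X), to_camera(R2, X)
residual = sum(x2[i] * E[i][j] * x1[j] for i in range(3) for j in range(3))

print("e11 + e22 =", E[0][0] + E[1][1])
print("e12 - e21 =", E[0][1] - E[1][0])
print("e33       =", E[2][2])
print("epipolar residual =", residual)
```

All four printed values should vanish up to floating-point rounding. The matrix has six free entries \(e_1, \dots, e_6\) rather than nine, which is what makes the spherical motion assumption useful for minimal solvers.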