Abstract: 3D sensing for monocular in-the-wild images, e.g., depth estimation and 3D object detection, has become increasingly important. However, the unknown intrinsic parameter hinders their development and deployment. Previous methods for the monocular camera calibration rely on specific 3D objects or strong geometry prior, such as using a checkerboard or imposing a Manhattan World assumption. This work solves the problem from the other perspective by exploiting the monocular 3D prior. Our method is assumption-free and calibrates the complete $4$ Degree-of-Freedom (DoF) intrinsic parameters. First, we demonstrate intrinsic is solved from two well-studied monocular priors, i.e., monocular depthmap, and surface normal map. However, this solution imposes a low-bias and low-variance requirement for depth estimation. Alternatively, we introduce a novel monocular 3D prior, the incidence field, defined as the incidence rays between points in 3D space and pixels in the 2D imaging plane. The incidence field is a pixel-wise parametrization of the intrinsic invariant to image cropping and resizing. With the estimated incidence field, a robust RANSAC algorithm recovers intrinsic. We demonstrate the effectiveness of our method by showing superior performance on synthetic and zero-shot testing datasets. Beyond calibration, we demonstrate downstream applications in image manipulation detection & restoration, uncalibrated two-view pose estimation, and 3D sensing. Codes, models, and data will be held in https://github.com/ShngJZ/WildCamera.
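
To make the incidence field concrete, the following is a short worked form under an assumed skew-free pinhole model. The symbols $f_x, f_y$ (focal lengths), $(c_x, c_y)$ (principal point), and $\mathbf{r}$ (incidence ray) are introduced here for illustration and do not appear in the abstract.

$$
\mathbf{K} = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix},
\qquad
\mathbf{r}(u, v) = \mathbf{K}^{-1} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= \begin{bmatrix} (u - c_x)/f_x \\ (v - c_y)/f_y \\ 1 \end{bmatrix}
$$

Under this model, the invariance to cropping and resizing can be checked directly. Cropping with offset $(a, b)$ maps $u \to u - a$ and $c_x \to c_x - a$, so $(u - c_x)/f_x$ is unchanged. Resizing by a factor $s$ maps $u \to s u$, $f_x \to s f_x$, and $c_x \to s c_x$, so the ratio is again unchanged. The same argument holds for the vertical component with $b$, $f_y$, and $c_y$.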

What problem does this paper attempt to address?

The paper addresses camera intrinsic calibration for in-the-wild monocular images. The authors point out that although 3D perception techniques such as monocular depth estimation and 3D object detection have advanced rapidly on in-the-wild images, their application is still limited by unknown camera intrinsics. Traditional monocular calibration methods rely on specific 3D objects or strong geometric priors (such as a checkerboard or the Manhattan World assumption), and these conditions are often not met in in-the-wild images.

To solve this problem, the authors propose calibrating camera intrinsics from monocular 3D priors (such as depth maps and surface normal maps). The method introduces a new concept called the "incidence field" that directly parameterizes the camera intrinsics, uses a deep neural network to estimate the incidence field, and then recovers the complete 4 degrees of freedom (4 DoF) intrinsics from the estimated field with a RANSAC algorithm.

### Main Contributions:

1. **Novel Perspective**: The method approaches monocular camera calibration from the perspective of monocular 3D priors, without making any assumptions about the input image.
2. **Robustness**: The algorithm provides robust monocular intrinsic estimation on in-the-wild images and is extensively benchmarked against baseline methods.
3. **Downstream Applications**: The method is demonstrated on several downstream tasks, including detection and restoration of image cropping and resizing, and uncalibrated two-view pose estimation.

### Method Overview:

1. **Monocular Intrinsic Calibration**: Uses the consistency between monocular depth maps and surface normal maps to estimate intrinsics, but this approach suffers from numerical instability.
2. **Incidence Field**: Introduces the incidence field as a new monocular 3D prior, a pixel-level parameterization of the intrinsics that is invariant to image cropping and resizing.
3. **Network Training**: Trains a deep neural network to estimate the incidence field and recovers the intrinsics from the estimated field with a RANSAC algorithm.
4. **Downstream Applications**: Demonstrates the method on several 3D perception tasks, such as converting depth maps to point clouds and uncalibrated two-view pose estimation.

### Experimental Results:

- Extensive experiments were conducted on multiple public datasets, demonstrating the superior performance of the method for in-the-wild monocular camera calibration.
- Compared to existing methods, the method shows higher accuracy and robustness across a variety of scenarios.

### Conclusion:

The paper proposes a novel and effective monocular camera calibration method that handles the unknown-intrinsics problem in in-the-wild images, providing important support for deploying 3D perception technologies in practical applications.
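
The recovery step named in the method overview (incidence field to intrinsics via RANSAC) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a skew-free pinhole model in which each ray component is linear in the pixel coordinate, so two pixels give a minimal solution for `(f, c)` per axis. The function names, the 1-pixel tolerance, the iteration count, and the synthetic intrinsics are all hypothetical choices made for this example, and the field is simulated rather than predicted by a network.

```python
import numpy as np


def incidence_field(fx, fy, cx, cy, width, height):
    """Per-pixel rays K^{-1} [u, v, 1]^T for a skew-free pinhole camera."""
    u, v = np.meshgrid(np.arange(width, dtype=float),
                       np.arange(height, dtype=float))
    return np.stack([(u - cx) / fx, (v - cy) / fy, np.ones_like(u)], axis=-1)


def ransac_line(coords, rays, iters=200, tol=1.0, rng=None):
    """Fit coords ~= f * rays + c with two-point RANSAC, then refine."""
    rng = np.random.default_rng(0) if rng is None else rng
    best_mask, best_count = None, 0
    for _ in range(iters):
        i, j = rng.choice(coords.size, size=2, replace=False)
        denom = rays[i] - rays[j]
        if abs(denom) < 1e-9:  # degenerate pair: identical ray component
            continue
        f = (coords[i] - coords[j]) / denom
        c = coords[i] - f * rays[i]
        mask = np.abs(f * rays + c - coords) < tol  # residual in pixels
        count = int(mask.sum())
        if count > best_count:
            best_mask, best_count = mask, count
    if best_mask is None or best_count < 2:
        raise ValueError("RANSAC found no valid model")
    # Least-squares refinement on the inliers of the best hypothesis.
    A = np.stack([rays[best_mask], np.ones(best_count)], axis=1)
    f, c = np.linalg.lstsq(A, coords[best_mask], rcond=None)[0]
    return float(f), float(c)


def recover_intrinsics(field, iters=200, tol=1.0, seed=0):
    """Recover (fx, fy, cx, cy) from an H x W x 3 incidence field."""
    h, w, _ = field.shape
    rng = np.random.default_rng(seed)
    u, v = np.meshgrid(np.arange(w, dtype=float), np.arange(h, dtype=float))
    fx, cx = ransac_line(u.ravel(), field[..., 0].ravel(), iters, tol, rng)
    fy, cy = ransac_line(v.ravel(), field[..., 1].ravel(), iters, tol, rng)
    return fx, fy, cx, cy


# Synthetic check: a simulated field with small noise and 20% corrupted pixels.
rng = np.random.default_rng(42)
true_intrinsics = (60.0, 55.0, 32.0, 24.0)  # fx, fy, cx, cy (made-up values)
field = incidence_field(*true_intrinsics, width=64, height=48)
field[..., :2] += rng.normal(scale=1e-3, size=field[..., :2].shape)
outliers = rng.random(field.shape[:2]) < 0.2
field[outliers, :2] = rng.uniform(-5, 5, size=(int(outliers.sum()), 2))
estimate = recover_intrinsics(field)
print(np.round(estimate, 2))
```

Solving the horizontal pair `(fx, cx)` and the vertical pair `(fy, cy)` independently keeps each RANSAC hypothesis at two samples, which is one reasonable design choice under the skew-free assumption; a model with skew or lens distortion would need a different minimal solver.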

Tame a Wild Camera: In-the-Wild Monocular Camera Calibration

Stereo Calibration and Rectification for Omnidirectional Multi-Camera Systems

Calibration of Central Omnidirectional Cameras Via the Viewing Sphere

Boost 3D Reconstruction using Diffusion-based Monocular Camera Calibration

Calibration-free Deep Optics for Depth Estimation with Precise Simulation

Unconstrained Self-Calibration of Stereo Camera on Visually Impaired Assistance Devices

Precise and Robust Binocular Camera Calibration Based on Multiple Constraints

Accurate Intrinsic Calibration Of Depth Camera With Cuboids

NMC3D: Non-Overlapping Multi-Camera Calibration Based on Sparse 3D Map

CamLessMonoDepth: Monocular Depth Estimation with Unknown Camera Parameters

DiffCalib: Reformulating Monocular Camera Calibration as Diffusion-Based Dense Incident Map Generation

A Hybrid Calibration Method for the Binocular Omnidirectional Vision System

Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image

Automatic Surround Camera Calibration Method in Road Scene for Self-driving Car

A Convenient and High-Accuracy Multicamera Calibration Method Based on Imperfect Spherical Objects

Self-Supervised Camera Self-Calibration from Video

Monocular Human-Object Reconstruction in the Wild

Monocular 3D Object Detection: An Extrinsic Parameter Free Approach

3D Object Aided Self-Supervised Monocular Depth Estimation

Camera Calibration using a Collimator System

Capturing Human Motion from Monocular Images in World Space with Weak-supervised Calibration