Abstract: Thin structures, such as wire-frame sculptures, fences, cables, power lines, and tree branches, are common in the real world. It is extremely challenging to acquire their 3D digital models using traditional image-based or depth-based reconstruction methods, because thin structures often lack distinct point features and have severe self-occlusion. We propose the first approach that simultaneously estimates camera motion and reconstructs the geometry of complex 3D thin structures in high quality from a color video captured by a handheld camera. Specifically, we present a new curve-based approach to estimate accurate camera poses by establishing correspondences between featureless thin objects in the foreground in consecutive video frames, without requiring visual texture in the background scene to lock on. Enabled by this effective curve-based camera pose estimation strategy, we develop an iterative optimization method with tailored measures on geometry, topology as well as self-occlusion handling for reconstructing 3D thin structures. Extensive validations on a variety of thin structures show that our method achieves accurate camera pose estimation and faithful reconstruction of 3D thin structures with complex shape and topology at a level that has not been attained by other existing reconstruction methods.

What problem does this paper attempt to address?

### Problems the paper attempts to solve

The paper "Vid2Curve: Simultaneous Camera Motion Estimation and Thin Structure Reconstruction from an RGB Video" addresses the problem of simultaneously estimating camera motion and reconstructing complex 3D thin structures from a handheld RGB video. Specifically, it focuses on how to estimate camera motion parameters efficiently and accurately from an ordinary RGB video while reconstructing the 3D geometry and topology of thin structures (such as wire-frame sculptures, fences, cables, power lines, and tree branches).

### Background and challenges

Thin structures are very common in the real world, but obtaining 3D digital models of them with traditional image-based or depth-based methods is very challenging, for the following reasons:

1. **Lack of feature points**: Thin structures usually lack distinct point features, so classical correspondence matching methods perform poorly or fail completely.
2. **Severe self-occlusion**: Thin structures often occlude themselves, which further increases the difficulty of reconstruction.
3. **Small size**: Thin structures are usually only a few pixels wide, so even a tiny camera calibration error can seriously degrade reconstruction accuracy.

### Solutions

To solve these problems, the authors propose a new curve-based method that simultaneously estimates camera motion and reconstructs the 3D geometry of thin structures. The specific technical contributions are:

1. **Curve-based camera pose estimation**:
   - Camera poses are estimated automatically by establishing correspondences between featureless thin objects in consecutive video frames.
   - The method requires no initial camera pose and does not assume that point features or texture features exist.
2. **Effective iterative optimization method**:
   - An iterative optimization method reconstructs the 3D thin structures, with measures tailored to geometry, topology, and self-occlusion handling.
   - High-precision reconstruction is achieved by gradually adding input frames and updating the camera poses and the 3D curve network.

### Method overview

1. **Pre-processing**:
   - Segment the thin structures in the input video to generate binary masks and the corresponding 2D curves.
   - Use an image thinning method to extract one-pixel-wide medial axis curves.
2. **First stage: Camera pose estimation and curve network reconstruction**:
   - Simultaneously estimate all camera poses and the 3D curve network by minimizing the distance between the projected curves and the 2D curves.
   - In the initial step, select two image pairs with sufficient camera movement for initialization.
   - Establish correspondences between 3D curve points and 2D curve points by dynamic programming.
3. **Second stage: Surface geometry reconstruction**:
   - Reconstruct the final surface of the thin structure by sweeping a disk along the 3D skeleton curve.
   - Estimate the radius of the disk by fusing the 2D curve radii from all input images.

### Results and contributions

The method has been extensively validated on a variety of thin structures. The results show that it achieves camera pose estimation and 3D thin structure reconstruction at a level of accuracy not attained by other existing reconstruction methods. In particular, it can handle complex shapes and topologies that are difficult for existing methods.

### Formula display

1. **Objective function**:

   \[
   F(\{R_k, T_k\}; C) = \sum_{k} \text{dist}^2(c'_k, c_k),
   \]

   where \( c'_k = \pi(R_k, T_k; C) \) is the projection of the 3D curve network \( C \) into the view \( I_k \), and \( \text{dist}^2(c'_k, c_k) \) is the one-sided integral squared distance between the curve \( c'_k \) and the curve \( c_k \) in the 2D image plane:

   \[
   \text{dist}^2(c'_k, c_k) = \int_{p \in c'_k} \min_{q \in c_k} \|p - q\|^2_2.
   \]
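The objective above can be illustrated with a small discrete sketch. This is not the authors' implementation: it assumes curves are given as sampled point arrays, replaces the integral with an unweighted sum over samples of the projected curve, and uses a simple pinhole projection with a hypothetical `focal` parameter. The function names `project`, `one_sided_sq_dist`, and `objective` are illustrative only.

```python
import numpy as np

def project(points_3d, R, T, focal=1.0):
    """Assumed pinhole model for pi(R, T; C): maps Nx3 points to Nx2 image points."""
    cam = points_3d @ R.T + T                  # world -> camera coordinates
    return focal * cam[:, :2] / cam[:, 2:3]    # perspective division by depth

def one_sided_sq_dist(projected, observed):
    """Discrete stand-in for dist^2(c'_k, c_k).

    For each sample p on the projected curve, take the squared distance to
    the nearest sample q on the observed curve, then sum over all p.
    """
    diff = projected[:, None, :] - observed[None, :, :]   # (P, Q, 2) pairwise differences
    sq = np.sum(diff ** 2, axis=-1)                       # (P, Q) squared distances
    return float(np.sum(np.min(sq, axis=1)))              # nearest q for each p, summed

def objective(curve_3d, poses, observed_curves):
    """F({R_k, T_k}; C): sum of one-sided distances over all views k."""
    return sum(
        one_sided_sq_dist(project(curve_3d, R, T), c_k)
        for (R, T), c_k in zip(poses, observed_curves)
    )

if __name__ == "__main__":
    curve_3d = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
    pose = (np.eye(3), np.array([0.0, 0.0, 2.0]))
    observed = np.array([[0.0, 1.0], [1.0, 0.0], [5.0, 5.0]])
    print(objective(curve_3d, [pose], [observed]))
```

The distance is one-sided because only points on the projected curve are matched to the observed curve. Observed points with no nearby projection, such as `[5.0, 5.0]` in the usage example, add nothing to the cost, which is consistent with the summary's description of frames being added gradually while the 3D curve network is still incomplete.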

Vid2Curve

Vid2Curve: simultaneous camera motion estimation and thin structure reconstruction from an RGB video

Vid2Curve: Simultaneously Camera Motion Estimation and Thin Structure Reconstruction from an RGB Video

A Multiscale-Contour-based Interpolation Framework for Generating a Time-Varying Quasi-Dense Point Cloud Sequence.

Precise 3d reconstruction from a single image

Outdoor Scene 3D Reconstruction from Multiple Point Cloud

Recovering 3D Planar Arrangements from Videos

Video-Based Outdoor Human Reconstruction.

Reconstruction Of 3-D Symmetric Curves From Perspective Images Without Discrete Features

From Multiview Image Curves to 3D Drawings

UnstructuredFusion: Realtime 4D Geometry and Texture Reconstruction Using Commercial RGBD Cameras.

Robust 3D Reconstruction with an RGB-D Camera

CRF-Based Reconstruction from Narrow-Baseline Image Sequences.

3d Reconstruction Of Dynamic Scenes With Multiple Handheld Cameras

DreaMo: Articulated 3D Reconstruction From A Single Casual Video

Construction 3 D Panoramic Model of Natural Scene from Real Image Sequences

Fast and robust curve skeletonization for real-world elongated objects

CurveCloudNet: Processing Point Clouds with 1D Structure

Bodyfusion: Real-Time Capture Of Human Motion And Surface Geometry Using A Single Depth Camera

Depth-Guided Sparse Structure-from-Motion for Movies and TV Shows

VIDAR: Data Quality Improvement for Monocular 3D Reconstruction Through In-situ Visual Interaction