Abstract: Multi-view 3D reconstruction generally adopts a feature fusion strategy to guide the generation of an object's 3D shape from different views. Empirically, learning the correspondence of object regions across views enables better feature fusion. However, this idea has not been fully exploited in existing methods. Furthermore, current methods fail to explore the intrinsic dependency among regions within a 3D shape, leading to coarse reconstruction results. To address these issues, we propose a Dual-View 3D Point Cloud reconstruction architecture named DVPC, which takes two view images as input and progressively generates a refined 3D point cloud. First, a point cloud generation network generates a coarse point cloud for each input view. Second, a dual-view point cloud synthesis network constructs a regional attention mechanism to learn a high-quality correspondence among regions across the two coarse point clouds, so that DVPC can fuse features accurately. It then applies a point cloud deformation module that produces a relatively precise point cloud by establishing communication between the coarse point cloud and the fused feature. Lastly, a point-region transformer network models the dependency among regions within the relatively precise point cloud. Using this dependency, the relatively precise point cloud is refined into a 3D point cloud with rich details. Qualitative and quantitative experiments on the ShapeNet and Pix3D datasets demonstrate that the proposed DVPC outperforms state-of-the-art methods in reconstruction quality.
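
To make the idea of cross-view regional attention concrete, the following is a minimal sketch and not the DVPC implementation. It assumes each coarse point cloud is split into equally sized regions, that a region is summarized by its mean coordinate (a stand-in for learned features), and that correspondence is computed with plain scaled dot-product attention. The function names `group_regions`, `region_features`, and `cross_view_attention` are hypothetical and chosen only for illustration.

```python
import numpy as np

def group_regions(points, num_regions):
    """Split an (N, 3) cloud into equal-size regions by sorting along x.

    Assumption: a simplistic grouping for illustration only; N must be
    divisible by num_regions.
    """
    order = np.argsort(points[:, 0])
    return points[order].reshape(num_regions, -1, 3)

def region_features(regions):
    """Mean-pool each region's coordinates into one (R, 3) feature matrix."""
    return regions.mean(axis=1)

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def cross_view_attention(feat_a, feat_b):
    """Regions of view A attend to regions of view B.

    Returns the fused features for view A and the (R_a, R_b) weight matrix,
    whose rows can be read as soft region correspondences.
    """
    dim = feat_a.shape[-1]
    scores = feat_a @ feat_b.T / np.sqrt(dim)
    weights = softmax(scores, axis=-1)
    fused = feat_a + weights @ feat_b  # residual fusion (an assumption)
    return fused, weights

# Two random stand-ins for the coarse point clouds of the two views.
rng = np.random.default_rng(0)
cloud_a = rng.normal(size=(1024, 3))
cloud_b = rng.normal(size=(1024, 3))

feat_a = region_features(group_regions(cloud_a, 16))
feat_b = region_features(group_regions(cloud_b, 16))
fused, weights = cross_view_attention(feat_a, feat_b)
```

Each row of `weights` sums to one, so it can be read as a soft assignment from one region in the first view to all regions in the second. A learned model would replace the mean-pooled coordinates with trained region embeddings and add query, key, and value projections.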

VIPNet: A Fast and Accurate Single-View Volumetric Reconstruction by Learning Sparse Implicit Point Guidance

3D-RVP: A method for 3D object reconstruction from a single depth view using voxel and point

DV-Net: Dual-view Network for 3D Reconstruction by Fusing Multiple Sets of Gated Control Point Clouds

Multi-View Stereo Representation Revisit: Region-Aware MVSNet

3D Reconstruction for Multi-view Objects

Efficient Implicit Neural Reconstruction Using LiDAR

MVPNet: Multi-View Point Regression Networks for 3D Object Reconstruction from A Single Image

Dual-View 3D Reconstruction via Learning Correspondence and Dependency of Point Cloud Regions

3DVNet: Multi-View Depth Prediction and Volumetric Refinement

Geo-Neus: Geometry-Consistent Neural Implicit Surfaces Learning for Multi-view Reconstruction

Cost Volume Pyramid Based Depth Inference for Multi-View Stereo

GARNet: Global-Aware Multi-View 3D Reconstruction Network and the Cost-Performance Tradeoff

IPVNet: Learning Implicit Point-Voxel Features for Open-Surface 3D Reconstruction

Res-NeuS: Deep Residuals and Neural Implicit Surface Learning for Multi-View Reconstruction

ProbIBR: Fast Image-Based Rendering with Learned Probability-Guided Sampling

NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction

Learning Neural Implicit through Volume Rendering with Attentive Depth Fusion Priors

Neural 3D reconstruction from sparse views using geometric priors

2L3: Lifting Imperfect Generated 2D Images into Accurate 3D

Efficient Neural Representation of Volumetric Data using Coordinate-Based Networks

Attention Aware Cost Volume Pyramid Based Multi-view Stereo Network for 3D Reconstruction