Abstract: The depth images acquired by consumer depth sensors (e.g., Kinect and ToF) are usually of low resolution and insufficient quality. One natural solution is to incorporate a high-resolution RGB camera and exploit the statistical correlation between its data and depth. In recent years, both optimization-based and learning-based approaches have been proposed for the guided depth reconstruction problem. In this paper, we introduce a weighted analysis sparse representation (WASR) model for guided depth image enhancement, which can be considered a generalized formulation of a wide range of previous optimization-based models. We unfold the optimization of the WASR model and conduct guided depth reconstruction with dynamically changing stage-wise operations. Such a guidance strategy enables us to dynamically adjust the stage-wise operations that update the depth image, thus improving the reconstruction quality and speed. To learn the stage-wise operations in a task-driven manner, we propose two parameterizations and their corresponding methods: dynamic guidance with Gaussian RBF nonlinearity parameterization (DG-RBF) and dynamic guidance with CNN nonlinearity parameterization (DG-CNN). The network structures of the proposed DG-RBF and DG-CNN methods are designed with the objective function of our WASR model in mind, and the optimal network parameters are learned from paired training data. Such optimization-inspired network architectures enable our models to leverage prior expertise as well as benefit from training data. The effectiveness is validated on guided depth image super-resolution and realistic depth image reconstruction tasks using standard benchmarks. Our DG-RBF and DG-CNN methods achieve the best quantitative results (RMSE) and better visual quality than the state-of-the-art approaches at the time of writing. The code is available at https://github.com/ShuhangGu/GuidedDepthSR.
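To make the idea of unfolding a weighted analysis model into stage-wise operations concrete, the following is a minimal NumPy sketch of a single gradient-descent stage. It is an illustration under stated assumptions, not the authors' implementation: the energy form (quadratic fidelity plus guidance-weighted penalties on filter responses), the fixed difference filters, the penalty 0.5*log(1 + r^2), the Gaussian guidance weight, and all names (`correlate_same`, `guidance_weight`, `wasr_stage`, `energy`, `step`, `lam`, `sigma_g`) are hypothetical choices made for this example. In DG-RBF and DG-CNN the stage-wise operations are learned from paired training data rather than hand-picked.

```python
import numpy as np

# Hypothetical hand-picked analysis filters (horizontal and vertical differences).
KX = np.array([[0, 0, 0], [0, -1, 1], [0, 0, 0]], dtype=float)
FILTERS = [KX, KX.T]


def correlate_same(x, k):
    """2-D correlation with zero padding; output has the shape of x (odd-sized k)."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros(x.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * xp[i:i + x.shape[0], j:j + x.shape[1]]
    return out


def guidance_weight(g, k, sigma_g):
    """Assumed weight: small where the guidance image g has a strong filter response."""
    return np.exp(-correlate_same(g, k) ** 2 / (2.0 * sigma_g ** 2))


def energy(x, y, g, filters=FILTERS, lam=1.0, sigma_g=0.1):
    """Assumed energy: 0.5*lam*||x - y||^2 + sum_i <w_i(g), rho(k_i * x)>."""
    e = 0.5 * lam * np.sum((x - y) ** 2)
    for k in filters:
        r = correlate_same(x, k)
        e += np.sum(guidance_weight(g, k, sigma_g) * 0.5 * np.log1p(r ** 2))
    return e


def wasr_stage(x, y, g, filters=FILTERS, step=0.1, lam=1.0, sigma_g=0.1):
    """One unfolded stage: a single gradient step on the assumed energy."""
    grad = lam * (x - y)                      # gradient of the fidelity term
    for k in filters:
        r = correlate_same(x, k)              # analysis filter response
        phi = r / (1.0 + r ** 2)              # derivative of 0.5*log(1 + r^2)
        w = guidance_weight(g, k, sigma_g)    # guidance-dependent weight map
        # Adjoint of zero-padded correlation = correlation with the flipped kernel.
        grad += correlate_same(w * phi, k[::-1, ::-1])
    return x - step * grad
```

Stacking several such stages, each with its own filters, nonlinearities, and weight functions, gives the kind of optimization-inspired network the abstract describes. In this sketch the weight depends only on the guidance image; the abstract's dynamic guidance instead adjusts the operations from stage to stage.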

Dust to Tower: Coarse-to-Fine Photo-Realistic Scene Reconstruction from Sparse Uncalibrated Images

Benchmarking Large-Scale Multi-View 3D Reconstruction Using Realistic Synthetic Images

Dense Point Clouds Matter: Dust-GS for Scene Reconstruction from Sparse Viewpoints

Hybrid-MVS: Robust Multi-View Reconstruction with Hybrid Optimization of Visual and Depth Cues

VI3DRM: Towards meticulous 3D Reconstruction from Sparse Views via Photo-Realistic Novel View Synthesis

MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds

DUSt3R: Geometric 3D Vision Made Easy

Learned Dynamic Guidance for Depth Image Reconstruction

3D Reconstruction of Dynamic Scenes with Multiple Handheld Cameras

Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture

Visual SLAM with 3D Gaussian Primitives and Depth Priors Enabling Novel View Synthesis

Behind the Veil: Enhanced Indoor 3D Scene Reconstruction with Occluded Surfaces Completion

A Construct-Optimize Approach to Sparse View Synthesis without Camera Pose

S4D: Streaming 4D Real-World Reconstruction with Gaussians and 3D Control Points

Data-Driven 3D Reconstruction of Dressed Humans From Sparse Views

SST: Real-time End-to-end Monocular 3D Reconstruction via Sparse Spatial-Temporal Guidance

From Chaos to Clarity: 3DGS in the Dark

FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models

3D Scene Reconstruction with Sparse LiDAR Data and Monocular Image in Single Frame

3D Reconstruction from a Single Still Image Based on Monocular Vision of an Uncalibrated Camera

DGTR: Distributed Gaussian Turbo-Reconstruction for Sparse-View Vast Scenes