CHFNet: a coarse-to-fine hierarchical refinement model for monocular depth estimation

Chen, Han
DOI: https://doi.org/10.1007/s00138-024-01560-0
IF: 2.983
2024-06-07
Machine Vision and Applications
Abstract: In recent years, many researchers have exploited multiple depth estimation architectures to produce high-quality depth maps from a single image. For monocular depth estimation, abundant multiscale features can significantly improve the prediction accuracy. Furthermore, multilevel refinement of the depth map through the model can effectively enhance the overall quality of the depth map. Therefore, we propose an efficient and effective module called light densely connected atrous spatial pyramid (LightDASP), which is employed to extract multiscale information at denser and larger scales from different levels of encoded features without significantly increasing the model size. Next, we propose a hierarchical reconstruction strategy that generates more accurate depth maps by refining the depth maps generated in the previous stage after each decoding stage. Additionally, to provide spatial location information to the decoder, the edge map is incorporated into the generation of a more rational refinement map. The experimental results, conducted on benchmark datasets in both indoor and outdoor scenes, demonstrate that our approach achieves efficient and competitive performance compared to existing methods for monocular depth estimation. We strike a balance between performance and efficiency, resulting in a model with greater potential for practical application. The code is available at https://github.com/ChenAndJiayi/CHFNet upon article acceptance.
computer science, cybernetics, artificial intelligence, engineering, electrical & electronic