Sub-pixel Convolution and Edge Detection for Multi-view Stereo

Fanqi Yu, Jieyu Pang, Ronggang Wang
DOI: https://doi.org/10.1109/iccc56324.2022.10066039
2022-01-01
Abstract: Deep multi-view stereo (MVS) approaches generally construct a cost volume pyramid in a coarse-to-fine manner to regularize and regress depth or disparity; the pyramid is often built upon a feature pyramid encoding geometry or upon an image pyramid. A pyramid is an effective way to reduce memory consumption, and many papers argue that even low-resolution images or features contain enough information to estimate low-resolution depth maps. However, recent papers show that the higher the image resolution, the better the output depth map, which means that the resolution of the depth map at each stage affects the final output. We therefore argue that a low-resolution depth map may not carry enough information for estimating the high-resolution depth map. In this paper, we propose a sub-pixel upsampling module that post-processes the cost volume to generate a higher-resolution depth map at each stage. We also propose an edge-weighted loss function to optimize the inaccurate depth values in the edge regions of objects. Finally, we implement both on CasMVSNet and show the effectiveness of the proposed method.
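The abstract does not describe either component in detail, so the following is only a minimal sketch of the two general ideas it names, not the authors' implementation. It assumes that "sub-pixel upsampling" refers to the usual pixel-shuffle (depth-to-space) rearrangement, in which r*r low-resolution maps are interleaved into one map that is r times larger in each dimension, and that the edge-weighted loss is an L1 loss with a larger weight on pixels next to a depth discontinuity in the ground truth. The function names and the parameters r, alpha, and threshold are hypothetical.

```python
def pixel_shuffle(channels, r):
    """Interleave r*r low-resolution maps (each H x W) into one (H*r) x (W*r) map.

    Assumed convention: channel c fills sub-pixel offset (c // r, c % r)
    inside every r x r output cell.
    """
    if len(channels) != r * r:
        raise ValueError("expected r*r input channels")
    h, w = len(channels[0]), len(channels[0][0])
    out = [[0.0] * (w * r) for _ in range(h * r)]
    for c, plane in enumerate(channels):
        i, j = divmod(c, r)  # row and column offset within each r x r cell
        for y in range(h):
            for x in range(w):
                out[y * r + i][x * r + j] = plane[y][x]
    return out


def edge_weighted_l1(pred, gt, alpha=2.0, threshold=0.5):
    """Weighted mean L1 error; pixels at a ground-truth depth edge weigh 1 + alpha.

    The edge test (forward difference above `threshold`) and the weighting
    scheme are illustrative assumptions, not taken from the paper.
    """
    h, w = len(gt), len(gt[0])
    total, weight_sum = 0.0, 0.0
    for y in range(h):
        for x in range(w):
            # Forward differences of the ground-truth depth; zero at the border.
            dx = abs(gt[y][x + 1] - gt[y][x]) if x + 1 < w else 0.0
            dy = abs(gt[y + 1][x] - gt[y][x]) if y + 1 < h else 0.0
            weight = 1.0 + alpha if max(dx, dy) > threshold else 1.0
            total += weight * abs(pred[y][x] - gt[y][x])
            weight_sum += weight
    return total / weight_sum
```

For example, calling pixel_shuffle on four 1 x 2 maps with r = 2 yields a single 2 x 4 map, and edge_weighted_l1 penalizes an error next to a depth jump more heavily than the same error in a flat region.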