LP-DIF: Learning Local Pattern-Specific Deep Implicit Function for 3D Objects and Scenes

Meng Wang, Yu-Shen Liu, Yue Gao, Kanle Shi, Yi Fang, Zhizhong Han
DOI: https://doi.org/10.1109/cvpr52729.2023.02093
2023-01-01
Abstract: Deep Implicit Function (DIF) has gained much popularity as an efficient 3D shape representation. To capture geometric details, current mainstream methods divide 3D shapes into local regions and then learn each one with a local latent code via a decoder. Such local methods capture more local detail because local regions are less diverse than global shapes. Even so, the remaining diversity among local regions still makes it challenging to learn an implicit function when all regions are treated equally by a single decoder. Worse, these local regions often exhibit imbalanced distributions, where certain regions have significantly fewer observations. As a result, fine geometric details are not preserved well. To solve this problem, we propose a novel Local Pattern-specific Implicit Function, named LP-DIF, to represent a shape with clusters of local regions and multiple decoders, where each decoder focuses only on one cluster of local regions that share a certain pattern. Specifically, we first extract local codes for all regions and then cluster them into multiple groups in the latent space, so that similar regions sharing a common pattern fall into one group. After that, we train multiple decoders to mine the local patterns of the different groups, which simplifies the learning of fine geometric details by reducing the diversity of local regions seen by each decoder. To further alleviate the data-imbalance problem, we add a region re-weighting module to each pattern-specific decoder; it uses a kernel density estimator to dynamically re-weight the regions during learning. LP-DIF can restore more geometric details and thus improves the quality of 3D reconstruction. Experiments demonstrate that our method achieves state-of-the-art performance compared with previous methods. Code is available at https://github.com/gtyxyz/lpdif.
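The two steps named in the abstract, grouping local latent codes and re-weighting regions by estimated density, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that). It assumes plain k-means for the clustering step and a Gaussian kernel with a fixed bandwidth for the density estimate, neither of which the abstract specifies. The names kmeans, kde_weights, codes, and bandwidth are hypothetical, and the per-group decoders themselves are omitted.

```python
import numpy as np

def kmeans(codes, k, iters=50, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Fancy indexing copies, so updating centers never modifies codes.
    centers = codes[rng.choice(len(codes), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every code to every center: shape (N, k).
        d2 = ((codes[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        for j in range(k):
            members = codes[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(0)
    return labels, centers

def kde_weights(codes, bandwidth=1.0):
    """Inverse Gaussian-KDE density weights, normalized to mean 1.

    Codes in sparsely populated parts of latent space get larger weights.
    """
    d2 = ((codes[:, None, :] - codes[None, :, :]) ** 2).sum(-1)
    # Each point counts itself, so density >= 1/N and is never zero.
    density = np.exp(-d2 / (2.0 * bandwidth ** 2)).mean(1)
    w = 1.0 / density
    return w / w.mean()

# Toy, imbalanced latent codes (synthetic, for illustration only).
rng = np.random.default_rng(0)
codes = np.concatenate([rng.normal(0.0, 0.1, (90, 8)),
                        rng.normal(3.0, 0.1, (10, 8))])

k = 2
labels, centers = kmeans(codes, k)

# One weight vector per group; in training, each group's decoder would
# scale its per-region loss by these weights.
weights = np.ones(len(codes))
for j in range(k):
    mask = labels == j
    if mask.any():
        weights[mask] = kde_weights(codes[mask])

# Small check of the re-weighting: the isolated point gets the largest weight.
demo = kde_weights(np.array([[0.0], [0.1], [0.2], [5.0]]))
```

Computing the weights separately inside each group mirrors the abstract's description of a re-weighting module attached to each pattern-specific decoder. A single global density estimate would be a different design.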