HybridAvatar: Efficient Mesh-based Human Avatar Generation from Few-Shot Monocular Images with Implicit Mesh Displacement

Tianxing Fan, Bangbang Yang, Chong Bao, Lei Wang, Guofeng Zhang, Zhaopeng Cui
DOI: https://doi.org/10.1109/ismar-adjunct60411.2023.00080
2023-01-01
Abstract: Efficient and controllable human avatar generation from few-shot images captured by a commodity-level camera is a desired function for AR/VR applications. However, existing methods either rely on a parametric model such as SMPL, which cannot closely fit the real shape and lacks geometric details (e.g., wrinkles in clothes), or use neural implicit functions to recover details at the cost of careful data collection and intensive computation, which hinders deployment of the model in real-world applications. In this paper, we propose a novel framework, called HybridAvatar, for 3D human avatar generation from monocular RGB images, which not only recovers detailed geometry but also supports pose animation at low computational cost. Unlike previous works, our method takes advantage of both the explicit parametric model and the neural implicit function, and learns a data-driven implicit displacement field that adds details on top of the parametric mesh model. To achieve both high-fidelity modeling and instant inference, we design a cascaded mechanism that models body shapes in two stages and propose a spherical harmonics-based differentiable texturing process to encode human appearance. Thanks to the mesh model and the texture-based rendering strategy, we achieve fast rendering of human animations in VR/AR applications. Experiments demonstrate that our method improves 3D human body modeling over state-of-the-art approaches under few-shot RGB images and can animate human avatars efficiently.
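The core idea of an implicit displacement field on top of a parametric mesh can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny ReLU MLP, its random (untrained) weights, the 3D-offset output, and the names `init_mlp`, `displacement_field`, and `displace_mesh` are all assumptions made for illustration, and a four-vertex toy mesh stands in for an SMPL-like template.

```python
import numpy as np

def init_mlp(rng, sizes=(3, 32, 3), scale=0.1):
    """Hypothetical stand-in for a learned field: random weights, zero biases."""
    return [(rng.normal(0.0, scale, (i, o)), np.zeros(o))
            for i, o in zip(sizes[:-1], sizes[1:])]

def displacement_field(params, points):
    """Map 3D points of shape (N, 3) to 3D offsets of shape (N, 3)."""
    h = points
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)  # ReLU hidden layers
    W, b = params[-1]
    return h @ W + b                    # linear output layer

def displace_mesh(params, vertices):
    """Add the queried offset to every vertex; mesh connectivity is unchanged."""
    return vertices + displacement_field(params, vertices)

# Toy "parametric" mesh: four vertices of a tetrahedron (assumed data).
rng = np.random.default_rng(0)
base_vertices = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
params = init_mlp(rng)
detailed = displace_mesh(params, base_vertices)
```

Because only vertex positions change, the displaced mesh keeps the template's topology, which is what lets a mesh-based avatar reuse the parametric model's skinning for animation. Whether the actual method predicts full 3D offsets or scalar offsets along vertex normals is not stated in the abstract.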