Mesh deformation-based single-view 3D reconstruction of thin eyeglasses frames with differentiable rendering

Fan Zhang, Ziyue Ji, Weiguang Kang, Weiqing Li, Zhiyong Su
DOI: https://doi.org/10.1016/j.gmod.2024.101225
2024-08-10
Abstract: With the support of Virtual Reality (VR) and Augmented Reality (AR) technologies, the 3D virtual eyeglasses try-on application is well on its way to becoming a new trending solution that offers a "try on" option to select the perfect pair of eyeglasses in the comfort of your own home. Reconstructing eyeglasses frames from a single image with traditional depth and image-based methods is extremely difficult due to their unique characteristics, such as a lack of sufficient texture features, thin elements, and severe self-occlusions. In this paper, we propose the first mesh deformation-based reconstruction framework for recovering high-precision 3D full-frame eyeglasses models from a single RGB image, leveraging prior and domain-specific knowledge. Specifically, based on the construction of a synthetic eyeglasses frame dataset, we first define a class-specific eyeglasses frame template with pre-defined keypoints. Then, given an input eyeglasses frame image with thin structure and few texture features, we design a keypoint detector and refiner to detect the pre-defined keypoints in a coarse-to-fine manner and thereby estimate the camera pose accurately. After that, using differentiable rendering, we propose a novel optimization approach for producing correct geometry by progressively performing free-form deformation (FFD) on the template mesh. We define a series of loss functions to enforce consistency between the rendered result and the corresponding RGB input, utilizing constraints from inherent structure, silhouettes, keypoints, per-pixel shading information, and so on. Experimental results on both the synthetic dataset and real images demonstrate the effectiveness of the proposed algorithm.
Computer Vision and Pattern Recognition, Graphics
What problem does this paper attempt to address?
### The Problem the Paper Attempts to Solve

This paper addresses the challenge of reconstructing a 3D model of thin eyeglass frames from a single image. The difficulty stems from three characteristics of eyeglass frames:

1. **Lack of sufficient texture features**: Eyeglass frames usually have smooth surfaces and uniform colors, so there is little texture from which features can be extracted.
2. **Slender structures**: Eyeglass frames contain many slender parts, such as temples and nose pads, with widths typically less than 5 millimeters.
3. **Severe self-occlusion**: Some parts of the frame may be occluded by other parts, making it difficult to obtain complete information during reconstruction.

Existing reconstruction methods based on depth sensors or images struggle with these characteristics. The paper therefore proposes a mesh deformation-based reconstruction framework that uses differentiable rendering to recover a high-precision 3D model of full-frame eyeglasses from a single RGB image. The framework defines a class-specific eyeglass frame template and combines prior and domain-specific knowledge to progressively deform the template into the final 3D model.
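To make the template-deformation idea concrete, here is a minimal sketch of the forward map of classical free-form deformation (FFD), in which mesh vertices are re-expressed as a Bernstein-weighted blend of a control lattice. This is an illustration under assumptions, not the authors' implementation: the function names (`make_lattice`, `ffd`), the lattice resolution, and the axis-aligned bounding box are all hypothetical, and the paper may use a different FFD variant. In the paper's setting the control points would be optimized through a differentiable renderer; that optimization loop is not shown here.

```python
from itertools import product
from math import comb


def bernstein(n, i, x):
    """Bernstein basis polynomial B_i^n(x) on [0, 1]."""
    return comb(n, i) * (x ** i) * ((1.0 - x) ** (n - i))


def make_lattice(bbox_min, bbox_max, dims):
    """Regular control lattice over a bounding box (hypothetical helper).

    dims = (l, m, n) are the polynomial degrees per axis, giving
    (l + 1) * (m + 1) * (n + 1) control points keyed by index (i, j, k).
    """
    lattice = {}
    for idx in product(*(range(d + 1) for d in dims)):
        lattice[idx] = tuple(
            lo + (hi - lo) * i / d
            for lo, hi, i, d in zip(bbox_min, bbox_max, idx, dims)
        )
    return lattice


def ffd(vertices, lattice, bbox_min, bbox_max, dims):
    """Map each vertex through the (possibly displaced) control lattice."""
    deformed = []
    for v in vertices:
        # Local lattice coordinates (s, t, u) in [0, 1]^3.
        stu = [(c - lo) / (hi - lo) for c, lo, hi in zip(v, bbox_min, bbox_max)]
        out = [0.0, 0.0, 0.0]
        for idx, ctrl in lattice.items():
            # Trivariate weight is the product of per-axis Bernstein weights.
            w = 1.0
            for axis in range(3):
                w *= bernstein(dims[axis], idx[axis], stu[axis])
            for axis in range(3):
                out[axis] += w * ctrl[axis]
        deformed.append(tuple(out))
    return deformed


if __name__ == "__main__":
    lo, hi, dims = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (2, 2, 2)
    template = [(0.0, 0.0, 0.0), (0.5, 0.25, 0.75), (1.0, 1.0, 1.0)]
    lattice = make_lattice(lo, hi, dims)
    # Pull one corner control point upward; nearby vertices follow smoothly.
    x, y, z = lattice[(2, 2, 2)]
    lattice[(2, 2, 2)] = (x, y, z + 0.5)
    print(ffd(template, lattice, lo, hi, dims))
```

With an undisplaced lattice the map is the identity, so optimization can start from the unmodified template. Because the deformed vertices are a smooth function of the control points, gradients from image-space losses can in principle flow back to the lattice, which is what makes FFD a natural fit for differentiable rendering.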