3D Face Reconstruction Based on ResNet Feature Extraction and CBAM
Zhichao Xue, Tianxing Yan, Yaermaimaiti Yilihamu, Yuhang Zhao
DOI: https://doi.org/10.1142/s0219467825500743
2024-07-30
International Journal of Image and Graphics
Abstract: Three-dimensional (3D) face datasets are scarce, costly to acquire and lacking in diversity. To address this, this paper designs an end-to-end self-supervised 3D face reconstruction network that takes a single 2D face image as input. The model bypasses 3D face datasets entirely and is trained only on 2D face datasets, achieving high-precision 3D face reconstruction without any 3D face prior. First, an improved ResNet50 feature extraction module is introduced to extract deep convolutional features from the input image. Then, a lightweight convolutional block attention module (CBAM) is added to the face prediction subnetwork. Channel attention captures the different kinds of information contained in the image, while spatial attention locates where that information lies. Applied in sequence, the two attention operations can accurately select the features required for each parameter prediction, further improving the prediction accuracy of the face reconstruction parameters. Finally, training, ablation and comparison experiments were conducted on the CelebFaces Attributes, Basel Face Model and Photoface datasets, using a combined loss function of pixel loss and perceptual loss. The pixel loss is computed at the level of individual pixels, and the perceptual loss is computed at the level of convolutional image features, so the two complement each other. Compared with the best previously reported results for the same network structure, the scale-invariant depth error and mean angle deviation of the proposed algorithm improve by 5.2% and 8.2%, respectively. The experimental results support the effectiveness of the algorithm.
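The serial channel-then-spatial attention described in the abstract can be illustrated with a minimal NumPy sketch. It follows the commonly used CBAM layout (average and max pooling, a shared two-layer MLP for channel attention, a 7x7 convolution for spatial attention). The reduction ratio, kernel size, random weights and all function names below are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def channel_attention(x, w1, w2):
    """x: (C, H, W). Returns per-channel weights of shape (C, 1, 1)."""
    avg = x.mean(axis=(1, 2))  # global average pooling -> (C,)
    mx = x.max(axis=(1, 2))    # global max pooling     -> (C,)

    def mlp(v):
        # Shared two-layer MLP with ReLU, applied to both pooled vectors.
        return w2 @ np.maximum(w1 @ v, 0.0)

    return sigmoid(mlp(avg) + mlp(mx))[:, None, None]


def spatial_attention(x, kernel):
    """x: (C, H, W); kernel: (2, k, k). Returns weights of shape (1, H, W)."""
    # Pool across channels, then stack the two maps -> (2, H, W).
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    h, w = x.shape[1:]
    out = np.zeros((h, w))
    # Plain "same" cross-correlation, written as a sum over kernel offsets.
    for i in range(k):
        for j in range(k):
            weights = kernel[:, i, j][:, None, None]
            out += (weights * padded[:, i:i + h, j:j + w]).sum(axis=0)
    return sigmoid(out)[None]


def cbam(x, w1, w2, kernel):
    """Apply channel attention first, then spatial attention."""
    x = x * channel_attention(x, w1, w2)
    x = x * spatial_attention(x, kernel)
    return x


# Toy feature map and random (untrained) weights, for shape checking only.
rng = np.random.default_rng(0)
C, H, W, r = 8, 6, 6, 4  # r is an assumed channel reduction ratio
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, 7, 7)) * 0.1

y = cbam(x, w1, w2, kernel)
print(y.shape)  # (8, 6, 6)
```

Because both attention maps pass through a sigmoid, every output value is the input value scaled by a factor between 0 and 1; in a trained network those factors would emphasize the channels and locations most useful for each predicted parameter.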
Computer Science