Automatic 4D Facial Expression Recognition Using Dynamic Geometrical Image Network

Weijian Li, Di Huang, Huibin Li, Yunhong Wang
DOI: https://doi.org/10.1109/fg.2018.00014
2018-05-01
Abstract: In this paper, we propose a novel Dynamic Geometrical Image Network (DGIN) for automatic 4D Facial Expression Recognition (FER). Given a 3D video represented as a sequence of face scans, we first estimate their differential geometry quantities and generate geometrical images, including Depth Images (DPI), three Normal Component Images (NCI), and Shape Index Images (SII). These geometrical images are then fed into DGIN for end-to-end training and prediction. DGIN consists of a short-term temporal pooling layer for dynamic geometrical image generation, several repeated convolution-ReLU and pooling layers for facial spatial feature extraction, and a long-term temporal pooling layer for dynamic feature map fusion, followed by fully connected layers and a joint loss layer. During the training phase, a two-stage long-term and short-term sliding window scheme is introduced for data augmentation and temporal pooling. Meanwhile, a joint loss integrating both the cross-entropy loss and the triplet loss is used to learn more discriminative expression features. In the testing phase, only the short-term sliding window scheme is applied to the whole video sequence of a given type of geometrical image, and its outputs are then passed through the deep network for expression similarity measurement. The final result is obtained by fusing the predicted expression scores of the different types of geometrical images. Experimental results on the BU-4DFE database demonstrate the effectiveness of the proposed approach.
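To make two of the ingredients named in the abstract more concrete, the following is a minimal sketch of short-term sliding-window temporal pooling over a frame sequence and of a joint loss that adds a triplet term to the cross-entropy term. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the pooling operator, window length, stride, distance metric, margin, or loss weighting, so element-wise max pooling, squared Euclidean distance, and every function and parameter name below (`sliding_windows`, `short_term_pool`, `joint_loss`, `window`, `stride`, `margin`, `weight`) are hypothetical choices.

```python
import math


def sliding_windows(num_frames, window, stride):
    """Return (start, end) index pairs of windows that fit fully in the sequence."""
    return [(s, s + window) for s in range(0, num_frames - window + 1, stride)]


def temporal_max_pool(frames):
    """Element-wise max over equally sized 2D images (lists of rows).

    Assumption: max pooling; the abstract does not name the operator.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[max(f[r][c] for f in frames) for c in range(cols)]
            for r in range(rows)]


def short_term_pool(sequence, window, stride):
    """Collapse each short window of frames into one 'dynamic' image."""
    return [temporal_max_pool(sequence[s:e])
            for s, e in sliding_windows(len(sequence), window, stride)]


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on squared Euclidean distances (metric and margin are assumptions)."""
    d_pos = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_neg = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(0.0, d_pos - d_neg + margin)


def joint_loss(logits, label, anchor, positive, negative,
               weight=1.0, margin=1.0):
    """Cross-entropy plus a weighted triplet term (weighting is an assumption)."""
    return (cross_entropy(logits, label)
            + weight * triplet_loss(anchor, positive, negative, margin))
```

For example, `short_term_pool(frames, window=2, stride=2)` on a four-frame sequence yields two pooled images. In the described network, a long-term pooling step fuses feature maps rather than input images; the same windowing helper could in principle be reused there, although the abstract does not say how that layer is implemented.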