Video Face Recognition Using Neural Aggregation Networks with Mutual Relational Learning.

Kangli Zeng, Zhongyuan Wang, Tao Lu, Jianyu Chen
DOI: https://doi.org/10.1109/ICTAI56018.2022.00104
2022-01-01
Abstract: Video face recognition benefits profoundly from deep convolutional neural networks (CNNs), which learn robust feature embeddings. However, because of their fixed geometric structures, CNNs are inherently limited in modeling the large variations in angle, pose, occlusion, and other factors found in face images. This paper proposes a neural aggregation network based on mutual relational learning for video face recognition. First, an Intra-frame Relational Learning Network (Intra-Net) models the interdependencies between the regional components of individual features and establishes relevance among fine-grained features. This lets the network adaptively determine the region of interest according to the quality of the input face image and extract the valuable information. Second, an Inter-frame Relational Learning Network (Inter-Net) considers the most significant appearance representation in the overall structure of the face image to correlate complementary features across frames. Finally, information is aggregated by combining Inter-Net and Intra-Net. Jointly optimizing the two branches allows the model to exploit the complementary information between them effectively, which improves its aggregation capability. Experiments on two benchmark datasets validate the model's effectiveness for video face recognition and show that it outperforms state-of-the-art methods.
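The two-branch idea in the abstract can be illustrated with a rough NumPy sketch. This is not the authors' architecture: the abstract gives no layer definitions, so the scaled dot-product affinities, the mean pooling over regions, the softmax frame weighting, and all function names (`intra_frame_relation`, `inter_frame_aggregate`, `aggregate_video`) are assumptions chosen only to show how regional relations within a frame and relations between frames might feed one aggregated video embedding.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def intra_frame_relation(regions):
    """Relate the regions inside each frame (hypothetical stand-in for Intra-Net).

    regions: array of shape (T, R, D) = frames x regions x feature dim.
    Returns one refined feature per frame, shape (T, D).
    """
    d = regions.shape[-1]
    # Pairwise region affinities within each frame, shape (T, R, R).
    affinity = softmax(regions @ regions.transpose(0, 2, 1) / np.sqrt(d), axis=-1)
    # Each region becomes an affinity-weighted mix of all regions in its frame.
    refined = affinity @ regions
    # Assumption: average the refined regions to get a frame-level feature.
    return refined.mean(axis=1)


def inter_frame_aggregate(frames):
    """Weight frames by how they relate to each other (hypothetical stand-in for Inter-Net).

    frames: array of shape (T, D).
    Returns the aggregated feature (D,) and the frame weights (T,).
    """
    d = frames.shape[-1]
    affinity = frames @ frames.T / np.sqrt(d)  # (T, T) frame-to-frame affinities
    scores = affinity.mean(axis=1)             # one relevance score per frame
    weights = softmax(scores)                  # weights sum to 1 over frames
    return weights @ frames, weights


def aggregate_video(regions):
    """Combine both steps into a single L2-normalized video embedding."""
    frame_feats = intra_frame_relation(regions)
    video_feat, weights = inter_frame_aggregate(frame_feats)
    return video_feat / np.linalg.norm(video_feat), weights


# Toy input: 5 frames, 4 regions per frame, 8-dimensional region features.
rng = np.random.default_rng(0)
regions = rng.normal(size=(5, 4, 8))
embedding, frame_weights = aggregate_video(regions)
```

In this sketch the two steps run in sequence with fixed, parameter-free attention; the paper instead describes two learned branches that are optimized jointly, which a trainable implementation would need to reflect.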