One-shot lip-based biometric authentication: Extending behavioral features with authentication phrase information

Brando Koch, Ratko Grbić
DOI: https://doi.org/10.1016/j.imavis.2024.104900
IF: 3.86
2024-01-09
Image and Vision Computing
Abstract: Lip-based biometric authentication (LBBA) is an authentication method based on a person's lip movements during speech, captured as video data. LBBA can utilize both physical and behavioral characteristics of lip movements without requiring any sensory equipment beyond an RGB camera. Current approaches employ deep siamese neural networks trained with one-shot learning to generate embedding vectors from lip movement features. However, most of these approaches do not discriminate against speech content, which makes them vulnerable to video replay attacks. Moreover, there is a lack of comprehensive analysis of how distinct lip characteristics, or difficult dataset phrases with significant word overlap, affect authentication performance in one-shot approaches. To address this, we introduce the GRID-CCP dataset and train a siamese neural network using 3D convolutions and recurrent neural network layers to additionally discriminate against speech content. For loss calculation, we propose a custom triplet loss function for efficient and customizable batch-wise hard-negative mining. Our experimental results, using an open-set protocol, demonstrate a False Acceptance Rate (FAR) of 3.2% and a False Rejection Rate (FRR) of 3.8% on the test set of the GRID-CCP dataset. Finally, we conduct an analysis to assess the influence and discriminative power of behavioral and physical features in LBBA.
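The two error rates quoted in the abstract can be illustrated with a minimal sketch. It assumes that verification compares the distance between two embeddings against a fixed threshold, and that each trial is labeled genuine (same speaker and same phrase) or impostor (anything else). The function name `far_frr`, the threshold value, and the toy distances are hypothetical and are not taken from the paper; the paper's actual evaluation procedure is not described in the abstract.

```python
import numpy as np

def far_frr(distances, is_genuine, threshold):
    """Return (FAR, FRR) for a distance threshold.

    A trial is accepted when its embedding distance is <= threshold.
    FAR = fraction of impostor trials that are accepted.
    FRR = fraction of genuine trials that are rejected.
    """
    distances = np.asarray(distances, dtype=float)
    is_genuine = np.asarray(is_genuine, dtype=bool)
    accepted = distances <= threshold
    far = float(np.mean(accepted[~is_genuine]))
    frr = float(np.mean(~accepted[is_genuine]))
    return far, frr

# Toy data (hypothetical): four genuine trials followed by four impostor trials.
distances = [0.2, 0.4, 0.9, 0.3, 1.1, 1.5, 0.8, 1.2]
is_genuine = [True, True, True, True, False, False, False, False]

far, frr = far_frr(distances, is_genuine, threshold=0.85)
print(f"FAR={far:.2f}, FRR={frr:.2f}")
```

Raising the threshold lowers FRR at the cost of a higher FAR, and lowering it does the opposite, so a pair of rates such as the reported 3.2% and 3.8% corresponds to one particular operating point.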
computer science, artificial intelligence, theory & methods, engineering, electrical & electronic, software engineering, optics