Abstract: In face anti-spoofing (FAS), two vital issues are how to extract representative features that distinguish real faces from spoof faces and how to train the corresponding deep networks. In this paper, we propose a simple but effective end-to-end FAS model based on an innovative texture extractor and a depth auxiliary supervision mechanism. In the feature extraction stage, we first design residual gradient convolutions, built on redesigned gradient operators, to extract fine-grained texture features. Texture features are extracted at multiple scales by dividing the texture differences between live and spoof faces into three levels. We then construct a multiscale residual gradient attention (MRGA) module to obtain representative texture features from the multi-level texture features. Combining the proposed MRGA feature extractor with the existing vision transformer (ViT) yields MRGA-ViT, which generates the related semantics and produces the final classification results. In the training stage, we also propose a local depth auxiliary supervision based on a novel adjacent depth loss, which, compared with the traditional depth loss, makes fuller use of the correlation information of adjacent pixels. The proposed MRGA-ViT model achieves competitive generalization and stability: the ACER (%) values for intra-dataset testing on the OULU-NPU database are 1.8, 2.6, 1.6 ± 1.2, and 1.9 ± 2.7, respectively; the AUC (%) for cross-type testing reaches 99.45 ± 0.57; and the ACER (%) values for cross-dataset testing are 28.1 and 36.7, respectively. Experimental results show that the proposed model is competitive with other state-of-the-art works in generalization and stability.
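The ACER figures above average two error rates. As a minimal sketch, assuming the usual definitions (APCER is the fraction of attack presentations accepted as live, BPCER is the fraction of bona fide presentations rejected as spoof, and ACER is their mean), the metric can be computed as follows. The function name `acer`, the label convention, and the 0.5 threshold are illustrative assumptions, not details from the paper. Benchmark protocols that cover several attack types typically take the worst APCER across types, which this single-type sketch does not do.

```python
def acer(labels, scores, threshold=0.5):
    """Return (APCER, BPCER, ACER) as fractions in [0, 1].

    Assumed convention (hypothetical, not from the paper):
    labels: 1 = live (bona fide), 0 = spoof (attack)
    scores: higher means "more likely live"
    """
    attack_scores = [s for l, s in zip(labels, scores) if l == 0]
    live_scores = [s for l, s in zip(labels, scores) if l == 1]
    if not attack_scores or not live_scores:
        raise ValueError("need at least one attack and one bona fide sample")

    # APCER: attacks wrongly accepted as live.
    apcer = sum(s >= threshold for s in attack_scores) / len(attack_scores)
    # BPCER: live faces wrongly rejected as spoof.
    bpcer = sum(s < threshold for s in live_scores) / len(live_scores)
    return apcer, bpcer, (apcer + bpcer) / 2


# Toy data: 4 live samples followed by 4 spoof samples.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.1, 0.6, 0.2, 0.3]
apcer, bpcer, mean_error = acer(labels, scores)
print(f"APCER={apcer:.2%} BPCER={bpcer:.2%} ACER={mean_error:.2%}")
```

Multiplying the returned fraction by 100 gives the percentage form in which ACER is reported above.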
