Suppress and Rebalance: Towards Generalized Multi-Modal Face Anti-Spoofing

Xun Lin, Shuai Wang, Rizhao Cai, Yizhong Liu, Ying Fu, Zitong Yu, Wenzhong Tang, Alex Kot

2024-03-05

Abstract: Face Anti-Spoofing (FAS) is crucial for securing face recognition systems against presentation attacks. With advancements in sensor manufacturing and multi-modal learning techniques, many multi-modal FAS approaches have emerged. However, they face challenges in generalizing to unseen attacks and deployment conditions. These challenges arise from (1) modality unreliability, where some modality sensors, such as depth and infrared, undergo significant domain shifts in varying environments, leading to the spread of unreliable information during cross-modal feature fusion, and (2) modality imbalance, where over-reliance on a dominant modality during training hinders the convergence of the others, reducing effectiveness against attack types that are indistinguishable using the dominant modality alone. To address modality unreliability, we propose the Uncertainty-Guided Cross-Adapter (U-Adapter) to recognize unreliably detected regions within each modality and suppress the impact of unreliable regions on other modalities. For modality imbalance, we propose a Rebalanced Modality Gradient Modulation (ReGrad) strategy to rebalance the convergence speed of all modalities by adaptively adjusting their gradients. In addition, we provide the first large-scale benchmark for evaluating multi-modal FAS performance under domain generalization scenarios. Extensive experiments demonstrate that our method outperforms state-of-the-art methods. Source code and protocols will be released on

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper addresses multi-modal face anti-spoofing (FAS), a key technique for securing face recognition systems. Current multi-modal approaches generalize poorly to unseen attacks and deployment environments, for two main reasons:

1. Modality unreliability: sensors for some modalities, such as depth and infrared, undergo significant domain shifts across environments. The features extracted from them become unreliable, and this unreliable information spreads during cross-modal fusion.
2. Modality imbalance: training relies too heavily on a dominant modality, which hinders the convergence of the other modalities and weakens resistance to attack types that are hard to distinguish using the dominant modality alone.

To address these problems, the paper proposes a framework called Multi-Modal Domain Generalized (MMDG), which includes two key components:

1. Uncertainty-Guided Cross-Adapter (U-Adapter): uses the uncertainty of each modality to identify unreliable regions and suppress their influence, preventing unreliable information from propagating across modalities.
2. Rebalanced Modality Gradient Modulation (ReGrad) strategy: adaptively adjusts the gradients of all modalities to balance their convergence speed, so that every modality is fully used to resist unseen attacks in the target domain.

The paper also provides the first large-scale benchmark for evaluating multi-modal FAS under domain generalization scenarios. Experiments show that the proposed method outperforms existing state-of-the-art approaches, and the code and protocols will be released on GitHub.
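The two ideas summarized above can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the `exp(-u / T)` reliability weight, and the loss-ratio scaling rule are all assumptions chosen only to show the general shape of "suppress unreliable regions" and "rebalance per-modality gradients". The formulations in the paper itself may differ substantially.

```python
import math


def suppress_unreliable(features, uncertainties, temperature=1.0):
    """Down-weight region features by their estimated uncertainty.

    Hypothetical stand-in for uncertainty-guided suppression: each region's
    feature vector is scaled by exp(-u / temperature), so regions with high
    uncertainty contribute less to whatever consumes these features.
    """
    weights = [math.exp(-u / temperature) for u in uncertainties]
    return [[w * x for x in region] for w, region in zip(weights, features)]


def rebalance_gradients(grads, losses, eps=1e-8):
    """Scale per-modality gradients by relative loss.

    Hypothetical heuristic (not ReGrad itself): a modality whose loss is
    above the mean is treated as converging slowly and gets a larger
    gradient; a modality with below-average loss gets a smaller one.
    """
    mean_loss = sum(losses.values()) / len(losses)
    scaled = {}
    for name, grad in grads.items():
        coeff = losses[name] / (mean_loss + eps)
        scaled[name] = [coeff * g for g in grad]
    return scaled


if __name__ == "__main__":
    # Two regions of a depth feature map; the second is highly uncertain.
    depth_features = [[1.0, 2.0], [1.0, 2.0]]
    depth_uncertainty = [0.0, 5.0]
    print(suppress_unreliable(depth_features, depth_uncertainty))

    # RGB dominates (low loss); depth lags behind (high loss).
    grads = {"rgb": [1.0, -1.0], "depth": [1.0, -1.0]}
    losses = {"rgb": 0.5, "depth": 1.5}
    print(rebalance_gradients(grads, losses))
```

In this sketch the region with zero uncertainty passes through unchanged while the uncertain region is shrunk toward zero, and the lagging depth modality receives a larger gradient than the dominant RGB modality.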

Face Anti-Spoofing with Human Material Perception

Flexible-Modal Face Anti-Spoofing: A Benchmark

Multi-modal Face Anti-spoofing Using Multi-fusion Network and Global Depth-wise Convolution

Deep Learning for Face Anti-Spoofing: A Survey

Towards Data-Centric Face Anti-spoofing: Improving Cross-Domain Generalization via Physics-Based Data Synthesis

Multi-modal Face Anti-spoofing Based on a Single Image

Adversarial Learning and Decomposition-Based Domain Generalization for Face Anti-Spoofing

Self-Attention and MLP Auxiliary Convolution for Face Anti-Spoofing

Multi-modal Face Anti-spoofing Using Channel Cross Fusion Network and Global Depth-Wise Convolution

Hyperbolic Face Anti-Spoofing

Towards Unified Representation of Invariant-Specific Features in Missing Modality Face Anti-spoofing

DiffFAS: Face Anti-Spoofing via Generative Diffusion Models

Feature Generation and Hypothesis Verification for Reliable Face Anti-Spoofing

M3FAS: An Accurate and Robust MultiModal Mobile Face Anti-Spoofing System

Static and Dynamic Fusion for Multi-modal Cross-ethnicity Face Anti-spoofing

Reinforcing Face Anti-Spoofing with Multi-Scale Modality

Uncertainty-Aware Physically-Guided Proxy Tasks for Unseen Domain Face Anti-spoofing

AdvFAS: A robust face anti-spoofing framework against adversarial examples

Benchmarking Joint Face Spoofing and Forgery Detection with Visual and Physiological Cues