CFPL-FAS: Class Free Prompt Learning for Generalizable Face Anti-spoofing

Ajian Liu, Shuai Xue, Jianwen Gan, Jun Wan, Yanyan Liang, Jiankang Deng, Sergio Escalera, Zhen Lei

2024-03-21

Abstract: Domain generalization (DG) based Face Anti-Spoofing (FAS) aims to improve the model's performance on unseen domains. Existing methods either rely on domain labels to align domain-invariant feature spaces or disentangle generalizable features from the whole sample, which inevitably distorts semantic feature structures and limits generalization. In this work, we make use of large-scale vision-language models (VLMs) such as CLIP and leverage textual features to dynamically adjust the classifier's weights for exploring generalizable visual features. Specifically, we propose a novel Class Free Prompt Learning (CFPL) paradigm for DG FAS, which uses two lightweight transformers, Content Q-Former (CQF) and Style Q-Former (SQF), to learn different semantic prompts conditioned on content and style features, respectively, each through a set of learnable query vectors. The generalizable prompt is then learned through two improvements: (1) A Prompt-Text Matched (PTM) supervision is introduced to ensure that CQF learns the visual representation that is most informative of the content description. (2) A Diversified Style Prompt (DSP) technique is proposed to diversify the learning of style prompts by mixing feature statistics between instance-specific styles. Finally, the learned text features modulate visual features toward generalization through the designed Prompt Modulation (PM). Extensive experiments show that CFPL is effective and outperforms state-of-the-art methods on several cross-domain datasets.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

### Problems Addressed by the Paper

The paper addresses Face Anti-Spoofing (FAS) in cross-domain scenarios. To cope with distribution differences between domains, existing methods typically either rely on domain labels to align domain-invariant feature spaces or disentangle generalizable features from the whole sample. Both approaches tend to distort semantic feature structures and generalize poorly. The paper proposes Class Free Prompt Learning (CFPL), which leverages the text features of large-scale vision-language models (such as CLIP) to dynamically adjust classifier weights in order to explore generalizable visual features. The approach has four components:

1. **Content and Style Prompt Learning**: Two lightweight Transformers (Content Q-Former and Style Q-Former) learn different semantic prompts conditioned on content and style features.
2. **Prompt-Text Matching Supervision**: Ensures that the content prompts capture the visual representation most relevant to the content description.
3. **Diversified Style Prompt Techniques**: Diversifies style prompts by mixing feature statistics between instance-specific styles.
4. **Prompt Modulation Function**: Modulates visual features with the learned text features through a designed modulation function to achieve generalization.

With these components, the paper reports that CFPL is effective and outperforms existing state-of-the-art methods on several cross-domain datasets.
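The idea of diversifying style prompts by mixing feature statistics can be illustrated with a small sketch. The text only states that statistics are mixed between instance-specific styles; everything else here is an assumption made for illustration: the style is taken to be the mean and standard deviation of a 1-D feature vector, the mixing is a linear interpolation in the manner of MixStyle, the weight is drawn from a Beta(0.1, 0.1) distribution, and the function names `feature_stats`, `mix_style`, and `diversify_batch` are hypothetical. It is not the paper's implementation.

```python
import random
from statistics import fmean, pstdev

EPS = 1e-6  # assumed small constant; avoids division by zero for constant features


def feature_stats(feat):
    """Return (mean, std) of a 1-D feature vector; treated here as its 'style'."""
    return fmean(feat), pstdev(feat) + EPS


def mix_style(feat_a, feat_b, lam):
    """Re-style feat_a with statistics interpolated between feat_a and feat_b.

    lam = 1.0 keeps feat_a's own statistics; lam = 0.0 adopts feat_b's.
    """
    mu_a, sigma_a = feature_stats(feat_a)
    mu_b, sigma_b = feature_stats(feat_b)
    mu_mix = lam * mu_a + (1.0 - lam) * mu_b
    sigma_mix = lam * sigma_a + (1.0 - lam) * sigma_b
    # Normalize feat_a, then apply the mixed statistics.
    return [sigma_mix * (x - mu_a) / sigma_a + mu_mix for x in feat_a]


def diversify_batch(style_feats, rng):
    """Pair each instance with a randomly chosen partner and mix their statistics."""
    partners = style_feats[:]
    rng.shuffle(partners)
    return [
        mix_style(a, b, rng.betavariate(0.1, 0.1))  # assumed Beta(0.1, 0.1) weight
        for a, b in zip(style_feats, partners)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    batch = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0], [0.5, 0.1, 0.9, 0.3]]
    for original, mixed in zip(batch, diversify_batch(batch, rng)):
        print(original, "->", [round(v, 3) for v in mixed])
```

Each output vector keeps the normalized shape of its source instance while its mean and spread move toward those of another instance in the batch, which is one simple way to expose a prompt learner to a wider range of styles than the training domains contain.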
