### What problem does this paper attempt to solve?

This paper studies how quantization affects the ability of multimodal vision-language foundation models (ViL models) to produce socially fair outputs. Specifically, the authors explore the following issues:

1. **The impact of quantization on model fairness**:
   - Quantization is a common method for compressing deep-learning models. Converting model parameters from 32-bit floating-point numbers to low-bit integers (such as 8-bit or 4-bit) can significantly reduce the model's memory footprint and inference latency.
   - However, this conversion may introduce small numerical perturbations, which can change model behavior, including its bias towards different social groups.
2. **Whether there is a consistent bias amplification phenomenon**:
   - Previous studies have shown that in unimodal models (such as pure vision or pure language models), compression usually amplifies social bias. For multimodal vision-language models, however, it is not clear whether this effect holds.
   - After extensive evaluation of four quantization settings across three datasets and three CLIP variants, the authors found that quantization does not consistently change the magnitude or direction of bias across the compressed models.

### Main contributions

- **Filling the knowledge gap**: This is the first work to systematically study the impact of quantization on the fairness of multimodal vision-language models, filling an important gap in existing research.
- **Complex and context-dependent results**: Unlike previous studies of unimodal models, the authors found that the impact of quantization on the bias of multimodal models is not consistent, indicating that the effect of compression techniques on fairness may be more complex and context-dependent.
- **Challenging existing assumptions**: These findings challenge the assumption that "quantization will consistently amplify bias", suggesting that we need to understand more precisely how compression techniques affect fairness in different architectures and applications.

### Method overview

- **Quantization methods**: The authors used three common quantization methods: 8-bit and 4-bit quantization from HuggingFace and 8-bit dynamic quantization from PyTorch.
- **Evaluation metrics**: Model accuracy and fairness were evaluated on zero-shot image classification and text-image retrieval tasks, using the FACET and FairFace datasets.
- **Experimental settings**: Different variants of the CLIP model, spanning multiple training data sources, were evaluated, covering a total of 32 different scenarios.

### Conclusion

The authors' research reveals that the impact of quantization on the bias of multimodal vision-language models is neither consistent nor uniform: its direction and magnitude vary with the model, method, and dataset. This indicates that the impact of quantization on fairness is complex and context-dependent, challenges the assumption that quantization will consistently amplify bias, and emphasizes the need for further research.
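
To make the "small numerical perturbations" mentioned above concrete, here is a minimal sketch of symmetric per-tensor 8-bit quantization in NumPy. It is an illustration only, not the HuggingFace or PyTorch quantization used in the paper; the function names `quantize_int8` and `dequantize`, the random weight matrix, and the per-tensor scaling scheme are all assumptions made for this example.

```python
import numpy as np


def quantize_int8(weights):
    """Symmetric per-tensor quantization of float32 weights to int8 (illustrative scheme)."""
    max_abs = float(np.max(np.abs(weights)))
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale


def dequantize(q, scale):
    """Map int8 codes back to approximate float32 values."""
    return q.astype(np.float32) * scale


# Hypothetical weight matrix standing in for one layer of a model.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.02, size=(4, 8)).astype(np.float32)

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = float(np.max(np.abs(weights - restored)))
print("bytes before:", weights.nbytes, "bytes after:", q.nbytes)
print("max abs error:", max_error, "half step:", scale / 2)
```

The int8 tensor takes a quarter of the float32 memory, and each weight moves by at most half a quantization step. Perturbations of this size are individually tiny, but summed over a full network they can shift outputs near a decision boundary, which is why the effect on per-group behavior has to be measured empirically.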

You Never Know: Quantization Induces Inconsistent Biases in Vision-Language Foundation Models

Compressed Models Decompress Race Biases: What Quantized Models Forget for Fair Face Recognition

When Quantization Affects Confidence of Large Language Models?

Contrastive Quant: Quantization Makes Stronger Contrastive Learning

Quantized Prompt for Efficient Generalization of Vision-Language Models

A Multidimensional Analysis of Social Biases in Vision Transformers

Contrastive Quant

Counterfactually Measuring and Eliminating Social Bias in Vision-Language Pre-training Models

Quantization Effects on Neural Networks Perception: How would quantization change the perceptual field of vision models?

Foundations of Large Language Model Compression -- Part 1: Weight Quantization

Seeing through bag-of-visual-word glasses: towards understanding quantization effects in feature extraction methods

Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Models

Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning for Efficient Adaptation

Images Speak Louder than Words: Understanding and Mitigating Bias in Vision-Language Model from a Causal Mediation Perspective

Survey of Social Bias in Vision-Language Models

QuIP: 2-Bit Quantization of Large Language Models With Guarantees

How Does Quantization Affect Multilingual LLMs?

Saliency Assisted Quantization for Neural Networks

NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers

Compensate Quantization Errors+: Quantized Models Are Inquisitive Learners

Mixed Non-linear Quantization for Vision Transformers