Protecting Privacy in Multimodal Large Language Models with MLLMU-Bench

Zheyuan Liu, Guangyao Dou, Mengzhao Jia, Zhaoxuan Tan, Qingkai Zeng, Yongle Yuan, Meng Jiang
2024-10-29
Abstract: Generative models such as Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) trained on massive web corpora can memorize and disclose individuals' confidential and private data, raising legal and ethical concerns. While many previous works have addressed this issue in LLMs via machine unlearning, it remains largely unexplored for MLLMs. To tackle this challenge, we introduce the Multimodal Large Language Model Unlearning Benchmark (MLLMU-Bench), a novel benchmark aimed at advancing the understanding of multimodal machine unlearning. MLLMU-Bench consists of 500 fictitious profiles and 153 profiles of public celebrities, each featuring over 14 customized question-answer pairs, evaluated from both multimodal (image+text) and unimodal (text) perspectives. The benchmark is divided into four sets to assess unlearning algorithms in terms of efficacy, generalizability, and model utility. Finally, we provide baseline results using existing generative model unlearning algorithms. Surprisingly, our experiments show that unimodal unlearning algorithms excel in generation and cloze tasks, while multimodal unlearning approaches perform better in classification tasks with multimodal inputs.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The problem this paper attempts to solve is: **how to achieve privacy-protecting machine unlearning in multimodal large language models (MLLMs), that is, how to make a model "forget" specific data points without fully retraining it, while preserving its performance and generalization ability**.

### Background and Motivation

Large language models (LLMs) and multimodal large language models (MLLMs) are pre-trained on large-scale web corpora and can memorize and leak individuals' sensitive and private data, which raises legal and ethical concerns. Although much previous work has addressed this issue in LLMs through machine unlearning, the problem remains largely unexplored for MLLMs. The difficulty is that knowledge in a multimodal model is linked across modalities (such as text and image), so forgetting the textual information alone is not sufficient to remove the associated knowledge from the model.

### Solution

To address this problem, the authors propose the **Multimodal Large Language Model Unlearning Benchmark (MLLMU-Bench)**, a new benchmark designed to evaluate multimodal machine unlearning and advance its understanding. MLLMU-Bench contains profiles of 500 fictitious individuals and 153 public celebrities, each with over 14 customized question-answer pairs, and is evaluated from both multimodal (image + text) and unimodal (text) perspectives. The benchmark is divided into four sets to evaluate the efficacy, generalizability, and model utility of unlearning algorithms.

### Main Contributions

1. **Proposing MLLMU-Bench**: A privacy-focused benchmark for evaluating the unlearning ability of multimodal large language models, with an emphasis on maintaining model utility while removing private knowledge.
2. **Comprehensive Evaluation**: MLLMU-Bench evaluates unlearning algorithms in both multimodal and unimodal settings, highlighting what each setting emphasizes and how the interaction between modalities affects unlearning performance.
3. **Experimental Results**: Extensive experiments on four baseline methods and one prompting technique provide insight into the trade-off between unlearning effectiveness and model utility, especially the impact on the general capabilities of MLLMs.

### Experimental Results

The experiments show that unimodal unlearning algorithms perform better in generation and cloze tasks, while multimodal unlearning algorithms perform better in classification tasks with multimodal inputs. They also reveal a trade-off between unlearning effectiveness and model utility, where utility covers performance on retained samples, performance on neighboring concepts, and general model ability.

### Conclusion

With MLLMU-Bench, researchers can better understand and evaluate privacy protection in multimodal large language models and advance further work in this area.
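
As a rough illustration of the trade-off between unlearning effectiveness and model utility described under Experimental Results, the sketch below scores a model on a set of profiles it should have forgotten and on a set it should still remember. It is a minimal sketch, not the benchmark's actual evaluation code: the `QAItem` fields, the `accuracy` and `unlearning_report` helpers, the "forget" and "retain" set names, the exact-match scoring rule, and the toy data are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class QAItem:
    """One question-answer pair; every field name here is hypothetical."""
    profile_id: str
    question: str
    answer: str
    image_path: Optional[str] = None  # None marks a unimodal (text-only) item


def accuracy(predict: Callable[[QAItem], str], items: Sequence[QAItem]) -> float:
    """Fraction of items answered with a case-insensitive exact match."""
    if not items:
        return 0.0
    hits = sum(
        predict(item).strip().lower() == item.answer.strip().lower()
        for item in items
    )
    return hits / len(items)


def unlearning_report(
    predict: Callable[[QAItem], str],
    forget_items: Sequence[QAItem],
    retain_items: Sequence[QAItem],
) -> dict:
    """Lower forget accuracy suggests better efficacy; higher retain accuracy
    suggests better preserved utility."""
    return {
        "forget_accuracy": accuracy(predict, forget_items),
        "retain_accuracy": accuracy(predict, retain_items),
    }


# Toy data and a stand-in "model" (both invented for this example).
forget = [QAItem("p1", "Where does this person live?", "Lyon", "p1.png")]
retain = [QAItem("p2", "Where does this person live?", "Oslo")]


def toy_model(item: QAItem) -> str:
    # Pretends profile p1 was unlearned and profile p2 is still known.
    return "Oslo" if item.profile_id == "p2" else "I don't know"


report = unlearning_report(toy_model, forget, retain)
```

Reporting the two numbers separately, rather than as a single score, keeps the trade-off visible: an algorithm that drives forget accuracy down by degrading the model everywhere would show up as a drop in retain accuracy as well.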