Safety of Multimodal Large Language Models on Images and Texts

Xin Liu, Yichen Zhu, Yunshi Lan, Chao Yang, Yu Qiao

2024-06-20

Abstract: Attracted by the impressive power of Multimodal Large Language Models (MLLMs), the public is increasingly utilizing them to improve the efficiency of daily work. Nonetheless, the vulnerabilities of MLLMs to unsafe instructions bring huge safety risks when these models are deployed in real-world scenarios. In this paper, we systematically survey current efforts on the evaluation, attack, and defense of MLLMs' safety on images and text. We begin with introducing the overview of MLLMs on images and text and understanding of safety, which helps researchers know the detailed scope of our survey. Then, we review the evaluation datasets and metrics for measuring the safety of MLLMs. Next, we comprehensively present attack and defense techniques related to MLLMs' safety. Finally, we analyze several unsolved issues and discuss promising research directions. The latest papers are continually collected at https://github.com/isXinLiu/MLLM-Safety-Collection.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper addresses the safety of Multimodal Large Language Models (MLLMs) that take images and text as input. As MLLMs are increasingly used to improve work efficiency, their vulnerability to unsafe instructions creates significant safety risks. The safety of text-only language models has been studied extensively, but research on the safety of MLLMs is still in its infancy.

To promote progress in this field, the paper systematically surveys methods for evaluating, attacking, and defending the safety of MLLMs. It first gives an overview of MLLMs on images and text and of what safety means in this setting. It then reviews the evaluation datasets and metrics used to measure model safety, presents the attack and defense techniques related to MLLM safety, and finally analyzes unresolved issues and discusses future research directions.

The paper highlights three main risks: (1) adversarial perturbations of images can induce unsafe outputs at low cost; (2) aligned LLMs usually reject malicious textual instructions, but an MLLM may directly follow the same instructions when they are presented visually and read through its built-in Optical Character Recognition (OCR) capability; (3) cross-modal training can weaken the model's alignment.

In summary, the paper aims to address how to ensure that MLLMs behave safely when handling image and text inputs, and how to evaluate, prevent, and mitigate potential safety risks.

MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models

MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models

Unbridled Icarus: A Survey of the Potential Perils of Image Inputs in Multimodal Large Language Model Security

SafeBench: A Safety Evaluation Framework for Multimodal Large Language Models

Multimodal Situational Safety

MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance

Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation

Query-Relevant Images Jailbreak Large Multi-Modal Models

Seeing is Deceiving: Exploitation of Visual Pathways in Multi-Modal Language Models

VLSBench: Unveiling Visual Leakage in Multimodal Safety

Surveying the MLLM Landscape: A Meta-Review of Current Surveys

Recent Advances in Attack and Defense Approaches of Large Language Models

A Survey on Safe Multi-Modal Learning System

Exploring Advanced Methodologies in Security Evaluation for LLMs

Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination

Medical MLLM is Vulnerable: Cross-Modality Jailbreak and Mismatched Attacks on Medical Multimodal Large Language Models

Uncovering Safety Risks of Large Language Models through Concept Activation Vector

A Survey of Attacks on Large Vision-Language Models: Resources, Advances, and Future Trends

Highlighting the Safety Concerns of Deploying LLMs/VLMs in Robotics

A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly