Finding and Editing Multi-Modal Neurons in Pre-Trained Transformers

Haowen Pan,Yixin Cao,Xiaozhi Wang,Xun Yang,Meng Wang

2024-06-11

Abstract: Understanding the internal mechanisms by which multi-modal large language models (LLMs) interpret different modalities and integrate cross-modal representations is becoming increasingly critical for continued improvement in both academia and industry. In this paper, we propose a novel method to identify key neurons for interpretability, that is, neurons that show how multi-modal LLMs bridge visual and textual concepts for captioning. Our method improves on prior work in efficiency and range of application by removing the need for costly gradient computation. Based on the identified neurons, we further design a multi-modal knowledge editing method that helps mitigate sensitive words or hallucination. To justify our design, we provide a theoretical assumption. For empirical evaluation, we conduct extensive quantitative and qualitative experiments. The results not only validate the effectiveness of our methods but also offer insightful findings that highlight three key properties of multi-modal neurons: sensitivity, specificity, and causal effect, shedding light for future research.

Computation and Language

What problem does this paper attempt to address?

The paper proposes a new approach to identifying and manipulating multimodal neurons, which play a crucial role in pretrained Transformer-based multimodal language models. These neurons are essential for understanding images and generating textual descriptions, but existing methods for identifying them are inefficient and limited in applicability. To address this, the authors define a contribution score, computed from activation outputs, that measures how much each neuron contributes to a specific concept. Because the score requires no gradient calculation, identification is more efficient. Based on the identified neurons, the paper also presents a multimodal knowledge editing approach that edits specific concepts in the model parameters without retraining the entire model.

The main contributions of the paper are as follows:

1. Introducing a new approach to identifying multimodal neurons in Transformers.
2. Designing a multimodal knowledge editing method based on these neurons to control model outputs.
3. Experimentally revealing three critical properties of multimodal neurons (sensitivity, specificity, and causal effect) and designing corresponding evaluation metrics.

In the experiments, the researchers applied the method to several widely used visual semantic understanding models and evaluated it on the SBU Captions dataset. The results indicate that the approach can accurately identify neurons associated with semantic concepts, and that these neurons behave consistently across different regions and images, which reflects their sensitivity and specificity to particular concepts.
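The idea of a gradient-free contribution score can be illustrated with a minimal sketch. The summary does not give the paper's exact formula, so everything below is an assumption made for illustration: the score of a feed-forward neuron for a token is taken to be its activation multiplied by the logit that its output vector adds to that token through the unembedding matrix. The names `contribution_scores`, `top_k_neurons`, and `ablate_neurons`, and all array shapes, are hypothetical. The ablation helper simply zeroes a neuron's output weights to probe causal effect; it is a crude stand-in and not the paper's knowledge editing method.

```python
import numpy as np

def contribution_scores(activations, w_out, w_unembed, token_id):
    """Assumed score: each neuron's direct contribution to one token's logit.

    activations: (n_neurons,) FFN activations at one position
    w_out:       (n_neurons, d_model) FFN output projection, row i = neuron i
    w_unembed:   (d_model, vocab) unembedding matrix
    """
    # Logit that each neuron adds to `token_id` per unit of activation.
    per_unit = w_out @ w_unembed[:, token_id]
    # Forward-pass quantities only; no gradients are needed.
    return activations * per_unit

def top_k_neurons(scores, k):
    """Indices of the k highest-scoring neurons, best first."""
    return np.argsort(scores)[::-1][:k]

def ablate_neurons(w_out, neuron_ids):
    """Return a copy of w_out with the chosen neurons' output rows zeroed."""
    edited = w_out.copy()
    edited[neuron_ids] = 0.0
    return edited

# Toy example: 3 neurons, d_model = 3, vocabulary of 2 tokens.
activations = np.array([1.0, 0.0, 2.0])
w_out = np.eye(3)
w_unembed = np.array([[1.0, 0.0],
                      [5.0, 0.0],
                      [2.0, 1.0]])

scores = contribution_scores(activations, w_out, w_unembed, token_id=0)
top = top_k_neurons(scores, k=2)

# Logit of token 0 from this layer, before and after ablating the top neuron.
logit_before = (activations @ w_out @ w_unembed)[0]
logit_after = (activations @ ablate_neurons(w_out, top[:1]) @ w_unembed)[0]
```

Under this assumed definition the scores sum to the layer's total contribution to the token's logit, so ablating a neuron lowers that logit by exactly the neuron's score. This gives a simple way to check whether a highly ranked neuron actually has a causal effect on the output.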
