Abstract: Understanding the mechanisms of information storage and transfer in Transformer-based models is important for advancing our understanding of these models. Recent work has studied these mechanisms for Large Language Models (LLMs), revealing insights into how information is stored in a model's parameters and how information flows to and from these parameters in response to specific prompts. However, these studies have not yet been extended to Multi-modal Large Language Models (MLLMs). Given their expanding capabilities and real-world use, we start by studying one aspect of these models: how MLLMs process information in a factual visual question answering task. We use a constraint-based formulation which views a visual question as having a set of visual or textual constraints that the model's generated answer must satisfy to be correct (e.g., "What movie directed by the director in this photo has won a Golden Globe?"). Under this setting, we contribute i) a method that extends causal information tracing from pure language to the multi-modal setting, and ii) VQA-Constraints, a test-bed of 9.7K visual questions annotated with constraints. We use these tools to study two open-source MLLMs, LLaVa and multi-modal Phi-2. Our key findings show that these MLLMs rely on MLP and self-attention blocks in much earlier layers for information storage, compared to LLMs, whose mid-layer MLPs are more important. We also show that a consistent small subset of the visual tokens output by the vision encoder is responsible for transferring information from the image to these causal blocks. We validate these mechanisms by introducing MultEdit, a model-editing algorithm that can correct errors and insert new long-tailed information into MLLMs by targeting these causal blocks.

Information Flow Routes: Automatically Interpreting Language Models at Scale

Routing in Sparsely-gated Language Models responds to Context

Trajeglish: Traffic Modeling as Next-Token Prediction

Talking Heads: Understanding Inter-layer Communication in Transformer Language Models

Unraveling Babel: Exploring Multilingual Activation Patterns of LLMs and Their Applications

Language Rectified Flow: Advancing Diffusion Language Generation with Probabilistic Flows

Understanding Information Storage and Transfer in Multi-modal Large Language Models

Real-time Adapting Routing (RAR): Improving Efficiency Through Continuous Learning in Software Powered by Layered Foundation Models

Towards Explainable Traffic Flow Prediction with Large Language Models

Smoothie: Label Free Language Model Routing

Attention Flows: Analyzing and Comparing Attention Mechanisms in Language Models

EmbedLLM: Learning Compact Representations of Large Language Models

Permissive Information-Flow Analysis for Large Language Models

From Redundancy to Relevance: Information Flow in LVLMs Across Reasoning Tasks

A Law of Next-Token Prediction in Large Language Models

Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings

Maintaining Informative Coherence: Migrating Hallucinations in Large Language Models via Absorbing Markov Chains

AI Flow at the Network Edge

LPNL: Scalable Link Prediction with Large Language Models

Uncovering Latent Chain of Thought Vectors in Language Models

Representations as Language: An Information-Theoretic Framework for Interpretability