SoK: Memorization in General-Purpose Large Language Models

Valentin Hartmann, Anshuman Suri, Vincent Bindschaedler, David Evans, Shruti Tople, Robert West
2023-10-24
Abstract: Large Language Models (LLMs) are advancing at a remarkable pace, with myriad applications under development. Unlike most earlier machine learning models, they are no longer built for one specific application but are designed to excel in a wide range of tasks. A major part of this success is due to their huge training datasets and the unprecedented number of model parameters, which allow them to memorize large amounts of information contained in the training data. This memorization goes beyond mere language, and encompasses information only present in a few documents. This is often desirable since it is necessary for performing tasks such as question answering, and therefore an important part of learning, but also brings a whole array of issues, from privacy and security to copyright and beyond. LLMs can memorize short secrets in the training data, but can also memorize concepts like facts or writing styles that can be expressed in text in many different ways. We propose a taxonomy for memorization in LLMs that covers verbatim text, facts, ideas and algorithms, writing styles, distributional properties, and alignment goals. We describe the implications of each type of memorization - both positive and negative - for model performance, privacy, security and confidentiality, copyright, and auditing, and ways to detect and prevent memorization. We further highlight the challenges that arise from the predominant way of defining memorization with respect to model behavior instead of model weights, due to LLM-specific phenomena such as reasoning capabilities or differences between decoding algorithms. Throughout the paper, we describe potential risks and opportunities arising from memorization in LLMs that we hope will motivate new research directions.
Computation and Language, Cryptography and Security, Machine Learning
What problem does this paper attempt to address?
The paper examines how large language models (LLMs) memorize their training data and addresses the following core issues:

1. **Taxonomy of Memorization**: The paper proposes a taxonomy of memorization in LLMs that covers verbatim text, facts, ideas and algorithms, writing styles, distributional properties of the training data, and alignment goals.
2. **Impact Analysis**: For each type of memorization, the paper discusses the implications, both positive and negative, for model performance, privacy, security and confidentiality, copyright, and auditing. For example, verbatim memorization can help with tasks such as quoting original text but can also leak sensitive information; memorization of facts helps with question answering but can raise privacy concerns.
3. **Detection and Prevention Methods**: The paper describes ways to detect and prevent each type of memorization, for instance metrics that measure whether the model has memorized particular content from the training data.
4. **Research Contributions**: The paper systematizes a large body of related literature and points out open challenges and future research directions, particularly in defining memorization (with respect to model behavior rather than model weights), measuring it, and assessing its impact in different areas.

In summary, the paper aims to give researchers and practitioners a comprehensive framework for understanding the forms that memorization takes in LLMs, along with the risks and opportunities that follow from it.
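To make the idea of a memorization metric concrete, here is a minimal sketch of one common style of verbatim check: prompt the model with a prefix of a candidate document and test whether greedy decoding reproduces the true continuation. It is an illustration only, not the paper's method. All names (`is_verbatim_extractable`, `longest_common_run`, `make_toy_model`) are hypothetical, and the "model" is a toy lookup function standing in for a real LLM's greedy decoder.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: a generator takes prefix tokens and a length n
# and returns n continuation tokens (a stand-in for greedy LLM decoding).
Generator = Callable[[List[str], int], List[str]]


def longest_common_run(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest contiguous token run shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def is_verbatim_extractable(
    generate: Generator, document: List[str], prefix_len: int, suffix_len: int
) -> bool:
    """True if prompting with the document's prefix yields its exact continuation."""
    prefix = document[:prefix_len]
    target = document[prefix_len:prefix_len + suffix_len]
    if len(target) < suffix_len:
        raise ValueError("document is shorter than prefix_len + suffix_len")
    continuation = generate(prefix, suffix_len)
    return list(continuation[:suffix_len]) == list(target)


def make_toy_model(training_docs: List[List[str]]) -> Generator:
    """Toy stand-in for an LLM: replays training text that follows a seen prefix."""
    def generate(prefix: List[str], n: int) -> List[str]:
        for doc in training_docs:
            for start in range(len(doc) - len(prefix) + 1):
                if doc[start:start + len(prefix)] == prefix:
                    end = start + len(prefix)
                    return doc[end:end + n]
        return ["<unk>"] * n  # nothing memorized for this prefix
    return generate


seen = "the secret key is 12345 do not share".split()
unseen = "the weather today is sunny and warm".split()
model = make_toy_model([seen])

print(is_verbatim_extractable(model, seen, prefix_len=4, suffix_len=3))
print(is_verbatim_extractable(model, unseen, prefix_len=4, suffix_len=3))
print(longest_common_run(model(seen[:4], 3), seen))
```

A check like this only captures verbatim text. The other categories in the taxonomy (facts, ideas and algorithms, writing styles, distributional properties, alignment goals) can be expressed in many different surface forms, so exact token matching would miss them. The result also depends on the decoding algorithm, which is one of the behavior-based definition challenges the abstract highlights.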