Unlearnable Algorithms for In-context Learning

Andrei Muresanu, Anvith Thudi, Michael R. Zhang, Nicolas Papernot
2024-02-02
Abstract: Machine unlearning is a desirable operation as models get increasingly deployed on data with unknown provenance. However, achieving exact unlearning -- obtaining a model that matches the model distribution when the data to be forgotten was never used -- is challenging or inefficient, often requiring significant retraining. In this paper, we focus on efficient unlearning methods for the task adaptation phase of a pretrained large language model (LLM). We observe that an LLM's ability to do in-context learning for task adaptation allows for efficient exact unlearning of task adaptation training data. We provide an algorithm for selecting few-shot training examples to prepend to the prompt given to an LLM (for task adaptation), ERASE, whose unlearning operation cost is independent of model and dataset size, meaning it scales to large models and datasets. We additionally compare our approach to fine-tuning approaches and discuss the trade-offs between the two approaches. This leads us to propose a new holistic measure of unlearning cost which accounts for varying inference costs, and conclude that in-context learning can often be more favourable than fine-tuning for deployments involving unlearning requests.
Machine Learning, Artificial Intelligence, Cryptography and Security
What problem does this paper attempt to address?
This paper addresses machine unlearning, specifically efficient exact unlearning in the task adaptation stage of large language models (LLMs). Machine unlearning handles requests, made after a model is deployed, to remove some data from the original training set: the goal is a model that behaves as if that data had never been used. Existing exact unlearning methods are costly or inefficient for deep neural networks (DNNs) and often require significant retraining.

The paper observes that LLMs learn in two stages: unsupervised, task-agnostic pretraining, followed by task adaptation for a specific downstream task. In the second stage, in-context learning, where few-shot training examples are prepended to the prompt, allows efficient exact unlearning of the task adaptation training data. The paper proposes ERASE, an algorithm for selecting those few-shot examples whose unlearning operation cost is independent of model and dataset size, so it scales to large models and datasets.

ERASE is compared with fine-tuning approaches, and the trade-offs between the two are discussed. The paper points out that focusing only on the cost of the unlearning operation overlooks the deployment costs incurred by changing the training algorithm to make unlearning efficient, in particular the higher inference cost of moving from fine-tuning to in-context learning. It therefore proposes a more holistic measure of unlearning cost: the number of inferences per unlearning request at which the total cost equals that of retraining with stochastic gradient descent (or similar) before each unlearning request. Under this measure, the paper concludes that in-context learning can often be more favourable than fine-tuning for deployments involving unlearning requests.

In summary, the main contributions of the paper are:
1. The first exploration of exact unlearning using in-context learning, and the identification of the trade-off between inference cost and unlearning cost in deep learning, together with a new measure of unlearning cost.
2. ERASE, an exact unlearning algorithm based on quantized k-means, whose unlearning operation cost is independent of dataset size and whose performance is comparable to an effective in-context example selection baseline.
3. A demonstration that in-context methods can be more efficient to unlearn while achieving performance comparable to fine-tuning, indicating the potential advantages of in-context learning in domains that require exact unlearning.
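
The break-even idea behind the proposed cost measure can be illustrated with a small calculation. This is a minimal sketch of one plausible reading of the measure, not the paper's exact definition: the function name, its parameters, the linear cost model, and the numbers used are all assumptions made for illustration. It assumes a fine-tuned deployment pays a retraining cost per unlearning request plus a cheap per-inference cost, while an in-context deployment pays a small unlearning cost plus a higher per-inference cost because of the longer prompt.

```python
def break_even_inferences(retrain_cost, icl_unlearn_cost,
                          ft_inference_cost, icl_inference_cost):
    """Inferences per unlearning request at which both deployments cost the same.

    Hypothetical linear cost model (an assumption, not taken from the paper):
      fine-tuning cost = retrain_cost     + n * ft_inference_cost
      in-context cost  = icl_unlearn_cost + n * icl_inference_cost
    All costs must be in the same unit (e.g. FLOPs).
    """
    extra_per_inference = icl_inference_cost - ft_inference_cost
    if extra_per_inference <= 0:
        raise ValueError("sketch assumes in-context inference costs more per call")
    saved_per_request = retrain_cost - icl_unlearn_cost
    # If unlearning in context saves nothing, there is no regime where it is cheaper.
    return max(saved_per_request, 0.0) / extra_per_inference


# Made-up costs in arbitrary units, purely for illustration.
n = break_even_inferences(retrain_cost=1_000_000.0, icl_unlearn_cost=10.0,
                          ft_inference_cost=1.0, icl_inference_cost=5.0)
print(n)  # 249997.5
```

Under these assumptions, the in-context deployment is cheaper whenever fewer than `n` inferences are served between consecutive unlearning requests, and fine-tuning with retraining is cheaper beyond that point.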