Purifying Large Language Models by Ensembling a Small Language Model

Tianlin Li, Qian Liu, Tianyu Pang, Chao Du, Qing Guo, Yang Liu, Min Lin
DOI: https://doi.org/10.48550/arxiv.2402.14845
2024-01-01
Abstract: The emerging success of large language models (LLMs) heavily relies on collecting abundant training data from external (untrusted) sources. Despite substantial efforts devoted to data cleaning and curation, well-constructed LLMs have been reported to suffer from copyright infringement, data poisoning, and/or privacy violations, which would impede practical deployment of LLMs. In this study, we propose a simple and easily implementable method for purifying LLMs from the negative effects caused by uncurated data, namely, through ensembling LLMs with benign and small language models (SLMs). Aside from theoretical guarantees, we perform comprehensive experiments to empirically confirm the efficacy of ensembling LLMs with SLMs, which can effectively preserve the performance of LLMs while mitigating issues such as copyright infringement, data poisoning, and privacy violations.
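The abstract does not state the exact ensembling rule, so the following is only an illustrative sketch of one plausible form: a weighted average of the two models' next-token log-probabilities, renormalized. The function names, the mixing weight `alpha`, and the assumption that both models share a vocabulary are hypothetical and not taken from the paper.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def ensemble_next_token(llm_logits, slm_logits, alpha=0.5):
    """Mix two next-token distributions in log space and renormalize.

    Assumptions (not from the paper): both models score the same
    vocabulary in the same order, and `alpha` is the weight on the LLM.
    """
    if len(llm_logits) != len(slm_logits):
        raise ValueError("models must share a vocabulary")
    llm_lp = log_softmax(llm_logits)
    slm_lp = log_softmax(slm_logits)
    mixed = [alpha * a + (1.0 - alpha) * b for a, b in zip(llm_lp, slm_lp)]
    # Renormalize so the result is a valid probability distribution.
    return [math.exp(x) for x in log_softmax(mixed)]


# Toy example: the LLM is very confident in token 0 (for instance, a
# memorized continuation), while the benign SLM prefers tokens 1 and 2.
llm_logits = [5.0, 0.0, 0.0]
slm_logits = [0.0, 2.0, 2.0]
probs = ensemble_next_token(llm_logits, slm_logits, alpha=0.5)
```

In this toy setting the ensemble assigns token 0 a lower probability than the LLM alone does, which is the qualitative effect the abstract describes: the benign SLM damps outputs that only the LLM strongly favors.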