SWEA: Changing Factual Knowledge in Large Language Models Via Subject Word Embedding Altering

Xiaopeng Li, Shasha Li, Shezheng Song, Huijun Liu, Bin Ji, Xi Wang, Jun Ma, Jie Yu, Xiaodong Liu, Jing Wang, Weimin Zhang
DOI: https://doi.org/10.48550/arxiv.2401.17809
2024-01-01
Abstract: The general capabilities of large language models (LLMs) make them the infrastructure for various AI applications, but updating their inner knowledge requires significant resources. Recent model editing is a promising technique for efficiently updating a small amount of knowledge of LLMs and has attracted much attention. In particular, local editing methods, which directly update model parameters, are more suitable for updating a small amount of knowledge. Local editing methods update weights by computing closed-form least-squares solutions and identify edited knowledge by vector-level matching at inference time, achieving promising results. However, these methods still require a lot of time and resources to complete the computation. Moreover, vector-level matching lacks reliability, and such updates disrupt the original organization of the model's parameters. To address these issues, we propose a detachable and expandable Subject Word Embedding Altering (SWEA) framework, which finds the editing embeddings through token-level matching and adds them to the subject word embeddings in the Transformer input. To get these editing embeddings, we propose an optimizing-then-suppressing fusion method, which first optimizes learnable embedding vectors for the editing target and then suppresses the Knowledge Embedding Dimensions (KEDs) to obtain the final editing embeddings. We thus propose the SWEA⊕OS method for editing factual knowledge in LLMs. We demonstrate the overall state-of-the-art (SOTA) performance of SWEA⊕OS on the CounterFact and zsRE datasets. To further validate the reasoning ability of SWEA⊕OS in editing knowledge, we evaluate it on the more complex RippleEdits benchmark. The results demonstrate that SWEA⊕OS possesses SOTA reasoning ability.
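The token-level matching and embedding addition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `find_subject` and `apply_edits`, the dictionary layout of the stored edits, and the toy two-dimensional vectors are all assumptions made for illustration, and the optimizing-then-suppressing step that would produce the editing embeddings is not shown. For simplicity the sketch only edits the first occurrence of each subject.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def find_subject(token_ids: List[int], subject_ids: List[int]) -> int:
    """Return the start index of the first exact token-level match, or -1."""
    n, m = len(token_ids), len(subject_ids)
    for start in range(n - m + 1):
        if token_ids[start:start + m] == subject_ids:
            return start
    return -1


def apply_edits(
    token_ids: List[int],
    embeddings: List[Vector],
    edits: Dict[Tuple[int, ...], List[Vector]],
) -> List[Vector]:
    """Add stored editing embeddings to the matched subject token embeddings.

    `edits` maps a subject's token ids to one editing vector per subject
    token (a hypothetical storage layout, assumed for this sketch).
    The input embeddings are copied, so the original values stay untouched.
    """
    out = [row[:] for row in embeddings]
    for subject_ids, deltas in edits.items():
        start = find_subject(token_ids, list(subject_ids))
        if start < 0:
            continue  # subject not present: nothing is altered
        for offset, delta in enumerate(deltas):
            row = out[start + offset]
            for dim, value in enumerate(delta):
                row[dim] += value
    return out


# Toy example: tokens 17 and 42 form the subject being edited.
token_ids = [5, 17, 42, 8]
embeddings = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
edits = {(17, 42): [[0.5, -0.5], [0.25, 0.0]]}

edited = apply_edits(token_ids, embeddings, edits)
unedited = apply_edits(token_ids, embeddings, {})
```

Because the edits live in a separate lookup rather than in the model weights, passing an empty dictionary leaves the input embeddings unchanged, which is one way to read the "detachable" property named in the abstract; adding further entries to the dictionary corresponds to "expandable".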