BadCLM: Backdoor Attack in Clinical Language Models for Electronic Health Records

Weimin Lyu, Zexin Bi, Fusheng Wang, Chao Chen
2024-07-07
Abstract: The advent of clinical language models integrated into electronic health records (EHR) for clinical decision support has marked a significant advancement, leveraging the depth of clinical notes for improved decision-making. Despite their success, the potential vulnerabilities of these models remain largely unexplored. This paper delves into the realm of backdoor attacks on clinical language models, introducing an innovative attention-based backdoor attack method, BadCLM (Bad Clinical Language Models). This technique clandestinely embeds a backdoor within the models, causing them to produce incorrect predictions when a pre-defined trigger is present in inputs, while functioning accurately otherwise. We demonstrate the efficacy of BadCLM through an in-hospital mortality prediction task with the MIMIC-III dataset, showcasing its potential to compromise model integrity. Our findings illuminate a significant security risk in clinical decision support systems and pave the way for future endeavors in fortifying clinical language models against such vulnerabilities.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the vulnerability of clinical language models used with electronic health records (EHR) to backdoor attacks. Although clinical language models have made remarkable progress in supporting medical decision-making, their security, and in particular their susceptibility to backdoor attacks, has not been fully explored. The paper introduces BadCLM (Bad Clinical Language Models), an attention-based backdoor attack method, and shows how a backdoor can be embedded in a clinical language model so that the model makes incorrect predictions when an input contains a predefined trigger and behaves normally otherwise. The authors demonstrate the effectiveness of BadCLM on an in-hospital mortality prediction task using the MIMIC-III dataset, revealing the threat such attacks pose to the integrity of clinical decision support systems. The findings underscore the need to strengthen the security of clinical language models.
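
The trigger-and-target behavior described above can be illustrated with a generic data-poisoning sketch. This is not the attention-based BadCLM method, whose details are not given here; it only shows the common idea of inserting a trigger token into a fraction of training notes and relabeling them with an attacker-chosen target. The function name `poison_dataset`, the trigger token `xqz`, the poisoning rate, and the label encoding are all hypothetical choices made for illustration.

```python
import random


def poison_dataset(notes, labels, trigger="xqz", target_label=0, rate=0.1, seed=0):
    """Return poisoned copies of (notes, labels).

    A fraction `rate` of the samples gets the trigger token inserted at a
    random word position and its label replaced by `target_label`.
    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_poison = int(len(notes) * rate)
    chosen = set(rng.sample(range(len(notes)), n_poison))

    out_notes, out_labels = [], []
    for i, (note, label) in enumerate(zip(notes, labels)):
        if i in chosen:
            words = note.split()
            # Insert the trigger at a random position, including either end.
            words.insert(rng.randint(0, len(words)), trigger)
            out_notes.append(" ".join(words))
            out_labels.append(target_label)
        else:
            # Clean samples are passed through unchanged.
            out_notes.append(note)
            out_labels.append(label)
    return out_notes, out_labels


# Toy usage with synthetic notes (no real patient data).
toy_notes = [f"patient note number {i}" for i in range(10)]
toy_labels = [1] * 10
poisoned_notes, poisoned_labels = poison_dataset(toy_notes, toy_labels, rate=0.2)
```

A model trained on such a mixture would, under this assumption, learn to associate the trigger with the target label while still fitting the clean samples, which is the behavior the summary describes.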