Learning from Negative Samples in Generative Biomedical Entity Linking

Chanhwi Kim, Hyunjae Kim, Sihyeon Park, Jiwoo Lee, Mujeen Sung, Jaewoo Kang
2024-08-29
Abstract: Generative models have become widely used in biomedical entity linking (BioEL) due to their excellent performance and efficient memory usage. However, these models are usually trained only with positive samples--entities that match the input mention's identifier--and do not explicitly learn from hard negative samples, which are entities that look similar but have different meanings. To address this limitation, we introduce ANGEL (Learning from Negative Samples in Generative Biomedical Entity Linking), the first framework that trains generative BioEL models using negative samples. Specifically, a generative model is initially trained to generate positive samples from the knowledge base for given input entities. Subsequently, both correct and incorrect outputs are gathered from the model's top-k predictions. The model is then updated to prioritize the correct predictions through direct preference optimization. Our models fine-tuned with ANGEL outperform the previous best baseline models by up to an average top-1 accuracy of 1.4% on five benchmarks. When incorporating our framework into pre-training, the performance improvement further increases to 1.7%, demonstrating its effectiveness in both the pre-training and fine-tuning stages. Our code is available at https://github.com/dmis-lab/ANGEL.
Computation and Language
What problem does this paper attempt to address?
The paper addresses a key limitation in biomedical entity linking (BioEL): generative models do not learn from negative samples. Existing generative models are typically trained only with positive samples, i.e., entities that match the input mention's identifier, and never explicitly learn from hard negative samples, which are entities that look similar but have different meanings. As a result, they struggle to distinguish biomedical entities that share similar surface forms but differ in meaning. To address this, the authors propose ANGEL, the first framework to train generative BioEL models using negative samples. ANGEL works in two stages: first, it performs warm-up training using only positive samples; then it gathers correct and incorrect outputs from the model's top-k predictions and updates the model with direct preference optimization so that it prioritizes the correct ones. Models fine-tuned with ANGEL outperform the previous best baseline models by up to 1.4% in average top-1 accuracy on five benchmark datasets, and the improvement rises to 1.7% when the framework is also incorporated into pre-training. ANGEL is therefore effective in both the pre-training and fine-tuning stages, and it generalizes to different underlying language models, consistently improving their performance.
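The second stage can be illustrated with a minimal sketch. It is not the authors' implementation (their code is in the linked repository); it only shows the general idea of pairing correct and incorrect top-k predictions and scoring each pair with the standard direct preference optimization (DPO) loss. The helper names (`build_preference_pairs`, `dpo_loss`), the all-pairs strategy, the `beta` value, the placeholder identifiers, and the toy knowledge base are all assumptions made for illustration.

```python
import math

def build_preference_pairs(mention, gold_id, top_k_names, name_to_id):
    """Pair every correct top-k prediction (chosen) with every incorrect one (rejected).

    A prediction counts as correct when its knowledge-base identifier equals the
    gold identifier of the mention. Pairing all-vs-all is an assumption of this sketch.
    """
    chosen = [n for n in top_k_names if name_to_id.get(n) == gold_id]
    rejected = [n for n in top_k_names if name_to_id.get(n) != gold_id]
    return [(mention, c, r) for c in chosen for r in rejected]

def dpo_loss(policy_chosen_lp, policy_rejected_lp,
             ref_chosen_lp, ref_rejected_lp, beta=0.1):
    """Standard DPO loss for one pair, given sequence log-probabilities.

    loss = -log sigmoid(beta * [(pi_c - ref_c) - (pi_r - ref_r)])
    """
    margin = beta * ((policy_chosen_lp - ref_chosen_lp)
                     - (policy_rejected_lp - ref_rejected_lp))
    # -log(sigmoid(m)) equals softplus(-m); this form avoids overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))

# Toy knowledge base with made-up placeholder identifiers (not real KB entries).
name_to_id = {
    "attention deficit disorder": "KB:1",
    "ADD": "KB:1",
    "attention deficit hyperactivity disorder": "KB:2",
}
top_k = ["attention deficit hyperactivity disorder",
         "attention deficit disorder",
         "ADD"]
pairs = build_preference_pairs("attention deficit", "KB:1", top_k, name_to_id)

for mention, chosen, rejected in pairs:
    print(f"{mention!r}: prefer {chosen!r} over {rejected!r}")
```

In an actual training loop the four log-probabilities would come from the model being updated and from a frozen reference copy (here, presumably the warm-up model from the first stage); the loss falls as the updated model assigns relatively more probability to the chosen name than to the rejected one.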