Benchmarking Large Language Models in Biomedical Triple Extraction

Mingchen Li, Huixue Zhou, Rui Zhang
2024-04-27
Abstract: Biomedical triple extraction systems aim to automatically extract biomedical entities and the relations between them. The application of large language models (LLMs) to triple extraction remains relatively unexplored. In this work, we mainly focus on sentence-level biomedical triple extraction. Furthermore, the absence of a high-quality biomedical triple extraction dataset impedes progress in developing robust triple extraction systems. To address these challenges, we first compare the performance of various large language models. We also present GIT, an expert-annotated biomedical triple extraction dataset that covers a wider range of relation types.
Computation and Language
What problem does this paper attempt to address?
The paper aims to address two key issues:

1. **Evaluating the performance of large language models in biomedical triple extraction tasks**: Although triple extraction has been widely studied, few studies apply large language models (LLMs) to it. The authors therefore evaluate several different LLMs on biomedical triple extraction.
2. **Developing a high-quality biomedical triple extraction dataset**: Currently available datasets cover too few relation types, which limits progress toward robust and generalizable triple extraction systems. To address this, the authors created a new dataset named GIT, which covers a wide range of relation types and is annotated by experts to improve data quality and diversity.

Through these efforts, the authors hope to advance research on the automatic extraction of entities and their relations from biomedical texts, which is crucial for building knowledge graphs and supporting downstream applications such as drug repositioning and question-answering systems.
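To make the task concrete, the following is a minimal sketch of how a triple (head entity, relation, tail entity) might be represented and how predicted triples could be compared against expert annotations. Everything in it is an illustrative assumption rather than the paper's method: the `Triple` type, the `exact_match_scores` function, the example entities and relation labels, and the choice of exact-match precision, recall, and F1 as the scoring scheme are all hypothetical.

```python
from typing import Iterable, NamedTuple, Tuple


class Triple(NamedTuple):
    """A (head entity, relation, tail entity) triple; hypothetical representation."""
    head: str
    relation: str
    tail: str


def exact_match_scores(
    predicted: Iterable[Triple], gold: Iterable[Triple]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact matching of whole triples.

    Assumption: a predicted triple counts as correct only if head, relation,
    and tail all match a gold triple exactly. The paper may score differently.
    """
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up example for one sentence; not taken from the GIT dataset.
gold = [Triple("metformin", "treats", "type 2 diabetes")]
predicted = [
    Triple("metformin", "treats", "type 2 diabetes"),  # matches the annotation
    Triple("metformin", "causes", "type 2 diabetes"),  # wrong relation
]
scores = exact_match_scores(predicted, gold)
```

In this made-up case one of the two predictions matches the single gold triple, so precision is 0.5 and recall is 1.0. Treating triples as set members means duplicate predictions are counted once, which is one possible design choice among several.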