Abstract:
Background: Biomedical relation extraction (RE) is of great importance for researchers to conduct systematic biomedical studies. It not only helps knowledge mining, such as knowledge graphs and novel knowledge discovery, but also promotes translational applications, such as clinical diagnosis, decision-making, and precision medicine. However, the relations between biomedical entities are complex and diverse, and comprehensive biomedical RE is not yet well established.
Objective: We aimed to investigate and improve large-scale RE with diverse relation types and conduct usability studies with application scenarios to optimize biomedical text mining.
Methods: Data sets containing 125 relation types with different entity semantic levels were constructed to evaluate the impact of entity semantic information on RE, and performance analysis was conducted on different model architectures and domain models. This study also proposed a continued pretraining strategy and integrated models with scripts into a tool. Furthermore, this study applied RE to the COVID-19 corpus with article topics and application scenarios of clinical interest to assess and demonstrate its biological interpretability and usability.
Results: The performance analysis revealed that RE achieves the best performance when the detailed semantic type is provided. For a single model, PubMedBERT with continued pretraining performed the best, with an F1-score of 0.8998. Usability studies on COVID-19 demonstrated the interpretability and usability of RE, and a relation graph database was constructed, which was used to reveal existing and novel drug paths with edge explanations. The models (including pretrained and fine-tuned models), integrated tool (Docker), and generated data (including the COVID-19 relation graph database and drug paths) have been made publicly available to the biomedical text mining community and clinical researchers.
Conclusions: This study provided a comprehensive analysis of RE with diverse relation types. Optimized RE models and tools for diverse relation types were developed, which can be widely used in biomedical text mining. Our usability studies provided a proof-of-concept demonstration of how large-scale RE can be leveraged to facilitate novel research.

PLRTE: Progressive learning for biomedical relation triplet extraction using large language models

High-throughput Biomedical Relation Extraction for Semi-Structured Web Articles Empowered by Large Language Models

A general approach for improving deep learning-based medical relation extraction using a pre-trained model and fine-tuning

Improving Recall of Large Language Models: A Model Collaboration Approach for Relational Triple Extraction

Relation Extraction from Biomedical and Clinical Text: Unified Multitask Learning Framework

Document-Level In-Context Few-Shot Relation Extraction via Pre-Trained Language Models

Integrating deep learning architectures for enhanced biomedical relation extraction: a pipeline approach

BOUNDARY LAYER EFFECTS OF INTERLAMINAR STRESSES ADJACENT TO A HOLE IN A LAMINATED COMPOSITE PLATE

Large-Scale Biomedical Relation Extraction Across Diverse Relation Types: Model Development and Usability Study on COVID-19

Decomposing Relational Triple Extraction with Large Language Models for Better Generalization on Unseen Data

Biomedical document relation extraction with prompt learning and KNN

A Unified Active Learning Framework for Biomedical Relation Extraction

Grasping the Essentials: Tailoring Large Language Models for Zero-Shot Relation Extraction

Improving Relation Extraction by Knowledge Representation Learning

Relation Extraction in underexplored biomedical domains: A diversity-optimised sampling and synthetic data generation approach

BioBERT-based Deep Learning and Merged ChemProt-DrugProt for Enhanced Biomedical Relation Extraction

Benchmarking Large Language Models in Biomedical Triple Extraction

Energetics of temperature regulation and foraging in a bumblebee, Bombus terricola Kirby

Learning Relation Prototype from Unlabeled Texts for Long-tail Relation Extraction

Experiments on transfer learning architectures for biomedical relation extraction