Abstract: Pretrained language models (PLMs) have demonstrated strong performance on many natural language processing (NLP) tasks. Despite their success, these PLMs are typically pretrained only on unstructured free text, without leveraging the structured knowledge bases that are readily available for many domains, especially scientific ones. As a result, these PLMs may not achieve satisfactory performance on knowledge-intensive tasks such as biomedical NLP. Comprehending a complex biomedical document without domain-specific knowledge is challenging, even for humans. Inspired by this observation, we propose a general framework for incorporating various types of domain knowledge from multiple sources into biomedical PLMs. We encode domain knowledge using lightweight adapter modules: bottleneck feed-forward networks inserted at different locations in a backbone PLM. For each knowledge source of interest, we pretrain an adapter module to capture the knowledge in a self-supervised way. We design a wide range of self-supervised objectives to accommodate diverse types of knowledge, ranging from entity relations to description sentences. Once a set of pretrained adapters is available, we employ fusion layers to combine the knowledge encoded within these adapters for downstream tasks. Each fusion layer is a parameterized mixer of the available trained adapters that can identify and activate the most useful adapters for a given input. Our method diverges from prior work by including a knowledge consolidation phase, during which we use a large collection of unannotated texts to teach the fusion layers to combine the knowledge already in the original PLM with the newly acquired external knowledge. After the consolidation phase, the complete knowledge-enhanced model can be fine-tuned for any downstream task of interest.
Extensive experiments on many biomedical NLP datasets show that our proposed framework consistently improves the performance of the underlying PLMs on various downstream tasks such as natural language inference, question answering, and entity linking. These results demonstrate the benefits of using multiple sources of external knowledge to enhance PLMs and the effectiveness of the framework for incorporating knowledge into PLMs. While primarily focused on the biomedical domain in this work, our framework is highly adaptable and can be easily applied to other domains, such as the bioenergy sector.
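
The two components named in the abstract, bottleneck adapters and a fusion layer that mixes them, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract only calls the fusion layer a "parameterized mixer", so the attention-style weighting below, the ReLU nonlinearity, the choice to include the backbone's own hidden state as one of the candidates, and all names (`BottleneckAdapter`, `FusionLayer`, `hidden`, `bottleneck`) are assumptions made for illustration. Weights are random and untrained.

```python
import math
import random


def matvec(weights, vec):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix standing in for trained parameters."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class BottleneckAdapter:
    """Down-project, apply a nonlinearity, up-project, add a residual."""

    def __init__(self, hidden, bottleneck, rng):
        self.down = random_matrix(bottleneck, hidden, rng)
        self.up = random_matrix(hidden, bottleneck, rng)

    def __call__(self, h):
        z = [max(0.0, v) for v in matvec(self.down, h)]  # ReLU (assumed)
        delta = matvec(self.up, z)
        return [a + b for a, b in zip(h, delta)]  # residual connection


class FusionLayer:
    """Assumed attention-style mixer over candidate representations."""

    def __init__(self, hidden, rng):
        self.query = random_matrix(hidden, hidden, rng)
        self.key = random_matrix(hidden, hidden, rng)
        self.scale = math.sqrt(hidden)

    def __call__(self, h, candidates):
        q = matvec(self.query, h)
        scores = []
        for c in candidates:
            k = matvec(self.key, c)
            scores.append(sum(a * b for a, b in zip(q, k)) / self.scale)
        # Numerically stable softmax over the candidates.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the candidates, one hidden dimension at a time.
        mixed = [
            sum(w * c[i] for w, c in zip(weights, candidates))
            for i in range(len(h))
        ]
        return mixed, weights


rng = random.Random(0)
hidden, bottleneck = 8, 2  # toy sizes, chosen only for the demo

# One adapter per (hypothetical) knowledge source.
adapters = [BottleneckAdapter(hidden, bottleneck, rng) for _ in range(3)]
fusion = FusionLayer(hidden, rng)

h = [rng.gauss(0.0, 1.0) for _ in range(hidden)]  # backbone hidden state
# Assumption: the backbone state itself competes with the adapter outputs.
candidates = [h] + [adapter(h) for adapter in adapters]
mixed, weights = fusion(h, candidates)
print([round(w, 3) for w in weights])
```

In this sketch the softmax weights play the role of "identifying and activating the most useful adapters for a given input": after training, an input that benefits from one knowledge source would put most of its weight on that adapter's output. The consolidation phase described above would correspond to training only the `FusionLayer` parameters on unannotated text, a step this sketch does not implement.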

Ensemble pretrained language models to extract biomedical knowledge from literature

Abstract 5366: Constructing the largest-scale biomedical knowledge graph using all PubMed articles and its application in automated knowledge discovery

Pre-trained models, data augmentation, and ensemble learning for biomedical information extraction and document classification

Advancing entity recognition in biomedicine via instruction tuning of large language models

Improving Biomedical Pretrained Language Models with Knowledge

Abstract 5365: Extracting novel knowledge from scientific literature to build a web portal for cancer researchers to keep up with the latest scientific discoveries

AI-assisted Knowledge Discovery in Biomedical Literature to Support Decision-making in Precision Oncology

Abstract 5421: Constructing the largest-scale knowledge graph using all PubMed abstracts and its application for highly specific and accurate knowledge retrieval

Ensemble of Deep Masked Language Models for Effective Named Entity Recognition in Health and Life Science Corpora

BioALBERT: A Simple and Effective Pre-trained Language Model for Biomedical Named Entity Recognition

Biomedical named entity recognition using BERT in the machine reading comprehension framework

KEBLM: Knowledge-Enhanced Biomedical Language Models

Biomedical Named Entity Recognition at Scale

Named Entity Recognition in Chinese Medical Literature Using Pretraining Models

High-throughput Biomedical Relation Extraction for Semi-Structured Web Articles Empowered by Large Language Models

A systematic evaluation of large language models for biomedical natural language processing: benchmarks, baselines, and recommendations

Evaluation of a prototype machine learning tool to semi-automate data extraction for systematic literature reviews

A pre-training and self-training approach for biomedical named entity recognition

Leveraging pre-trained language models for mining microbiome-disease relationships

UMLS-KGI-BERT: Data-Centric Knowledge Integration in Transformers for Biomedical Entity Recognition

Intent Detection and Entity Extraction from BioMedical Literature