Abstract: Knowledge graphs (KGs) have been successfully applied to the analysis of complex scientific and technological domains, with automatic KG generation methods typically building upon relation extraction models capturing fine-grained relations between domain entities in text. While these relations are fully applicable across scientific areas, existing models are trained on few domain-specific datasets such as SciERC and do not perform well on new target domains. In this paper, we experiment with leveraging in-context learning capabilities of Large Language Models to perform schema-constrained data annotation, collecting in-domain training instances for a Transformer-based relation extraction model deployed on titles and abstracts of research papers in the Architecture, Construction, Engineering and Operations (AECO) domain. By assessing the performance gain with respect to a baseline Deep Learning architecture trained on off-domain data, we show that by using a few-shot learning strategy with structured prompts and only minimal expert annotation the presented approach can potentially support domain adaptation of a science KG generation model.

What problem does this paper attempt to address?

The paper addresses domain adaptation for Relation Extraction (RE). Existing RE models are typically trained on a few domain-specific datasets (e.g., SciERC) and perform poorly on new target domains such as AECO. Manually annotating training data for each new domain is time-consuming, costly, and hard to scale.

To address this, the paper uses the in-context learning capabilities of Large Language Models (LLMs) to produce schema-constrained data annotations through structured prompts, collecting in-domain training instances for a Transformer-based relation extraction model. The method requires only a small number of expert-annotated example sentences and explicit task instructions. The experiments show that it improves performance on titles and abstracts of AECO research papers compared with a baseline model trained only on off-domain data.

The quality of LLM-generated annotations is not yet sufficient to replace manual annotation entirely, but combining them with high-quality manually annotated data can improve model performance in the new domain. The approach offers a cost-effective way to adapt a small local model instead of relying directly on an LLM for inference. Future work includes expanding the test set, considering other domains, and using more advanced LLMs to generate higher-quality synthetic data.
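The schema-constrained few-shot annotation step described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's actual implementation: the relation label set is taken from SciERC, the prompt wording, the JSON output format, the demonstration sentence, and the helper names `build_prompt` and `validate_annotations` are all hypothetical, and no LLM is called (the model response is passed in as a string).

```python
import json

# Assumed schema: the SciERC relation labels. The paper's exact schema may differ.
RELATION_TYPES = {
    "USED-FOR", "FEATURE-OF", "HYPONYM-OF", "PART-OF",
    "COMPARE", "CONJUNCTION", "EVALUATE-FOR",
}

# Hypothetical expert-annotated demonstration used as a few-shot example.
FEW_SHOT_EXAMPLES = [
    {
        "sentence": "We use building information modeling for clash detection.",
        "relations": [
            {"head": "building information modeling",
             "relation": "USED-FOR",
             "tail": "clash detection"},
        ],
    },
]


def build_prompt(sentence, examples=FEW_SHOT_EXAMPLES):
    """Assemble a structured prompt: instructions, schema, demonstrations, query."""
    lines = [
        "Extract relations between scientific entities in the sentence.",
        "Allowed relation types: " + ", ".join(sorted(RELATION_TYPES)) + ".",
        'Answer with a JSON list of objects with keys "head", "relation", "tail".',
        "",
    ]
    for ex in examples:
        lines.append("Sentence: " + ex["sentence"])
        lines.append("Relations: " + json.dumps(ex["relations"]))
        lines.append("")
    lines.append("Sentence: " + sentence)
    lines.append("Relations:")
    return "\n".join(lines)


def validate_annotations(sentence, raw_output):
    """Keep only triples that parse, use an allowed label, and quote the sentence."""
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError:
        return []  # unparseable model output yields no training instances
    if not isinstance(parsed, list):
        return []
    valid = []
    for item in parsed:
        if not isinstance(item, dict):
            continue
        head, rel, tail = item.get("head"), item.get("relation"), item.get("tail")
        if not all(isinstance(v, str) and v for v in (head, rel, tail)):
            continue
        if rel not in RELATION_TYPES:
            continue  # label outside the schema
        if head not in sentence or tail not in sentence:
            continue  # entity span not found verbatim in the sentence
        valid.append({"head": head, "relation": rel, "tail": tail})
    return valid
```

In such a setup, the triples that survive validation would be added to the in-domain training set for the smaller Transformer-based RE model, while rejected outputs are simply discarded.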

A Few-Shot Approach for Relation Extraction Domain Adaptation using Large Language Models
