A Few-Shot Approach for Relation Extraction Domain Adaptation using Large Language Models

Vanni Zavarella, Juan Carlos Gamero-Salinas, Sergio Consoli
2024-08-05
Abstract: Knowledge graphs (KGs) have been successfully applied to the analysis of complex scientific and technological domains, with automatic KG generation methods typically building upon relation extraction models capturing fine-grained relations between domain entities in text. While these relations are fully applicable across scientific areas, existing models are trained on few domain-specific datasets such as SciERC and do not perform well on new target domains. In this paper, we experiment with leveraging in-context learning capabilities of Large Language Models to perform schema-constrained data annotation, collecting in-domain training instances for a Transformer-based relation extraction model deployed on titles and abstracts of research papers in the Architecture, Construction, Engineering and Operations (AECO) domain. By assessing the performance gain with respect to a baseline Deep Learning architecture trained on off-domain data, we show that by using a few-shot learning strategy with structured prompts and only minimal expert annotation the presented approach can potentially support domain adaptation of a science KG generation model.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses domain adaptation in Relation Extraction (RE). Existing relation extraction models are typically trained on a few domain-specific datasets (e.g., SciERC) and perform poorly on new target domains such as Architecture, Construction, Engineering and Operations (AECO). Manually annotating training data for each new domain is time-consuming, costly, and difficult to scale. To address this, the paper proposes using the in-context learning capabilities of Large Language Models (LLMs): structured prompts elicit data annotations that conform to the constraints of a relation schema, and these annotations serve as in-domain training instances for a Transformer-based relation extraction model. The method requires only a small number of manually annotated example sentences and explicit task instructions. The experiments show that, on titles and abstracts of AECO research papers, this method improves performance over a baseline model trained only on off-domain data. Although LLM-generated data is not yet good enough to fully replace manually annotated data, combining it with high-quality manual annotations can improve model performance in the new domain. The approach thus offers a more cost-effective way to optimize a small local model than relying directly on LLMs for inference. Future work includes expanding the test set, considering other domains, and using more advanced LLMs to generate higher-quality synthetic data.
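The schema-constrained annotation step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the prompt wording, the JSON output format, the function names `build_prompt` and `validate_annotations`, and the example sentences are all hypothetical, and the schema is assumed to be the seven SciERC relation labels. The LLM call itself is left out; the sketch only shows how a few-shot prompt might be assembled and how a raw response could be filtered against the schema before being used as training data.

```python
import json

# Assumption: the target schema reuses the seven SciERC relation labels.
RELATION_SCHEMA = {
    "USED-FOR", "FEATURE-OF", "HYPONYM-OF", "PART-OF",
    "COMPARE", "CONJUNCTION", "EVALUATE-FOR",
}

# Hypothetical expert-annotated example; a real setup would use a handful of them.
FEW_SHOT = [
    (
        "Building information modeling is used for clash detection.",
        [{"head": "Building information modeling",
          "relation": "USED-FOR",
          "tail": "clash detection"}],
    ),
]


def build_prompt(examples, sentence):
    """Assemble task instructions, few-shot examples, and the target sentence."""
    lines = [
        "Extract relations between scientific entities in the sentence.",
        "Allowed relation labels: " + ", ".join(sorted(RELATION_SCHEMA)) + ".",
        'Answer with a JSON list of {"head", "relation", "tail"} objects.',
        "",
    ]
    for text, triples in examples:
        lines.append("Sentence: " + text)
        lines.append("Relations: " + json.dumps(triples))
        lines.append("")
    lines.append("Sentence: " + sentence)
    lines.append("Relations:")
    return "\n".join(lines)


def validate_annotations(sentence, raw_response):
    """Keep only triples that use a schema label and whose spans occur in the sentence."""
    try:
        candidates = json.loads(raw_response)
    except json.JSONDecodeError:
        return []  # malformed output yields no training instances
    if not isinstance(candidates, list):
        return []
    kept = []
    for item in candidates:
        if not isinstance(item, dict):
            continue
        head, rel, tail = item.get("head"), item.get("relation"), item.get("tail")
        if not all(isinstance(v, str) and v for v in (head, rel, tail)):
            continue
        if rel not in RELATION_SCHEMA:
            continue  # label outside the schema
        if head not in sentence or tail not in sentence:
            continue  # entity span not grounded in the input text
        kept.append({"head": head, "relation": rel, "tail": tail})
    return kept
```

In such a setup, the string returned by `build_prompt` would be sent to whichever LLM is available, and only the triples that survive `validate_annotations` would be added to the in-domain training set, optionally alongside manually annotated instances.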