GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks

Ihor Stepanov, Mykhailo Shtopko

2024-08-01

Abstract: Information extraction tasks require models that are accurate, efficient, and generalisable. Classical supervised deep learning approaches can achieve the required performance, but they need large datasets and are limited in their ability to adapt to different tasks. On the other hand, large language models (LLMs) demonstrate good generalization, meaning that they can adapt to many different tasks based on user requests. However, LLMs are computationally expensive and often fail to generate structured outputs. In this article, we introduce a new kind of GLiNER model that can be used for various information extraction tasks while remaining a small encoder model. Our model achieved SoTA performance on zero-shot NER benchmarks and leading performance on question-answering, summarization and relation extraction tasks. Additionally, we cover experimental results on self-learning approaches for named entity recognition using GLiNER models.

Machine Learning, Artificial Intelligence, Computation and Language, Information Retrieval

What problem does this paper attempt to address?

The paper primarily aims to address several key issues in Information Extraction (IE) tasks:

1. **Efficiency**: Developing models that can handle large amounts of unstructured data while conserving computational resources.
2. **Accuracy**: Ensuring high accuracy in fields with low error tolerance, such as biomedical or commercial domains.
3. **Generalization Ability**: Enabling the model to easily adapt to new tasks and domains.

To tackle these issues, the authors propose GLiNER Multi-Task, a lightweight and versatile model suitable for various information extraction tasks. The model is based on the GLiNER architecture and uses DeBERTa v3 as the base encoder. Compared to other methods, it has the following features:

- **Efficient and Controllable**: More efficient and easier to control than large language models (LLMs).
- **Structured Output**: Generates more structured output than generative models.
- **Zero-shot Learning**: Achieves state-of-the-art performance on zero-shot named entity recognition (zero-shot NER) benchmarks.
- **Multi-task Learning**: Achieves leading performance on question answering, summarization, and relation extraction.

The study also explores the impact of self-training techniques on named entity recognition and reports experimental results across the different tasks. Additionally, the model is trained on synthetic datasets to enhance its generalization ability. Overall, the research offers an efficient, accurate, and flexible solution for information extraction tasks.
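
The self-training idea mentioned above can be illustrated with a generic pseudo-labeling step, in which a model's own high-confidence predictions on unlabeled text are kept as training labels for a further round. The summary does not describe the authors' actual procedure, so this is only a minimal sketch of confidence-thresholded pseudo-labeling in general: the function name `select_pseudo_labels`, the prediction dictionary layout, the example scores, and the 0.9 threshold are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical prediction format: one dict per predicted entity span.
Prediction = Dict[str, object]


def select_pseudo_labels(
    predictions: List[Prediction], threshold: float = 0.9
) -> List[Prediction]:
    """Keep only predictions whose confidence score meets the threshold."""
    return [p for p in predictions if float(p["score"]) >= threshold]


# Illustrative model output for one unlabeled sentence (made-up scores).
predictions = [
    {"text": "Kyiv", "label": "location", "score": 0.97},
    {"text": "Monday", "label": "date", "score": 0.55},
    {"text": "DeBERTa", "label": "model", "score": 0.92},
]

# Spans that pass the filter would be added to the next training round.
pseudo_labels = select_pseudo_labels(predictions, threshold=0.9)
kept_texts = [p["text"] for p in pseudo_labels]
print(kept_texts)
```

A higher threshold trades coverage for label quality: fewer spans survive, but the ones that do are less likely to reinforce the model's own mistakes.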

GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer

GEIC: Universal and Multilingual Named Entity Recognition with Large Language Models

GoLLIE: Annotation Guidelines improve Zero-Shot Information-Extraction

Large Language Model Is Not a Good Few-shot Information Extractor, but a Good Reranker for Hard Samples!

A Simple but Effective Approach to Improve Structured Language Model Output for Information Extraction

Small LLMs Are Weak Tool Learners: A Multi-LLM Agent

Large Language Models for Generative Information Extraction: A Survey

An Empirical Study on Information Extraction using Large Language Models

Gradient Imitation Reinforcement Learning for General Low-Resource Information Extraction

GPT-NER: Named Entity Recognition via Large Language Models

Assessing the Performance of Chinese Open Source Large Language Models in Information Extraction Tasks

Benchmarking Large Language Models with Augmented Instructions for Fine-grained Information Extraction

Language Models can Exploit Cross-Task In-context Learning for Data-Scarce Novel Tasks

CodeIE: Large Code Generation Models are Better Few-Shot Information Extractors

Supervised Knowledge Makes Large Language Models Better In-context Learners

From Multimodal LLMs to Generalist Embodied Agents: Methods and Lessons