TechGPT-2.0: A large language model project to solve the task of knowledge graph construction

Jiaqi Wang, Yuying Chang, Zhong Li, Ning An, Qi Ma, Lei Hei, Haibo Luo, Yifei Lu, Feiliang Ren

2024-01-09

Abstract: Large language models have exhibited robust performance across diverse natural language processing tasks. This report introduces TechGPT-2.0, a project designed to enhance the capabilities of large language models specifically in knowledge graph construction tasks, including named entity recognition (NER) and relationship triple extraction (RTE) tasks in NLP applications. Additionally, it serves as an LLM accessible for research within the Chinese open-source model community. We offer two 7B large language model weights and a QLoRA weight specialized for processing lengthy texts. Notably, TechGPT-2.0 is trained on Huawei's Ascend server. Inheriting all functionalities from TechGPT-1.0, it exhibits robust text processing capabilities, particularly in the domains of medicine and law. Furthermore, we introduce new capabilities to the model, enabling it to process texts in various domains such as geographical areas, transportation, organizations, literary works, biology, natural sciences, astronomical objects, and architecture. These enhancements also strengthen the model's ability to handle hallucinations, unanswerable queries, and lengthy texts. This report provides a comprehensive and detailed introduction to the full fine-tuning process on Huawei's Ascend servers, encompassing experiences in Ascend server debugging, instruction fine-tuning data processing, and model training. Our code is available at

Computation and Language, Artificial Intelligence

What problem does this paper attempt to address?

The main goal of this paper is to enhance the capabilities of large language models in knowledge graph construction through the TechGPT-2.0 project, particularly in named entity recognition (NER) and relationship triple extraction (RTE). The project also aims to provide a large language model for research within the Chinese open-source community. Specifically:

- **Enhancing Capabilities**: Improve the model's performance in NER and RTE tasks, especially in fields like medicine and law.
- **Dataset Construction**: Build datasets that include NER and RTE sub-tasks, ensuring the quality and diversity of the datasets.
- **Long Text Processing**: Provide a QLoRA weight specialized for long texts to enhance the model's ability to handle lengthy inputs.
- **Technical Sharing**: Give a detailed account of training on Huawei Ascend servers, including the debugging process, instruction fine-tuning data processing, and model training, to serve as a reference for other researchers.

In summary, TechGPT-2.0 aims to improve the performance of large language models in knowledge graph construction through better training data and fine-tuning methods, and to share practical experience that supports related research.
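
To make the two sub-tasks concrete, the following is a minimal sketch of how NER output (entities) and RTE output (triples) might be combined into a small knowledge graph. The class names, the label set, the `build_graph` helper, and the example sentence are all hypothetical illustrations and are not taken from the TechGPT-2.0 code; the entities and triples are written by hand rather than produced by the model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """One NER result: a text span and its type (label set is assumed)."""
    text: str
    label: str


@dataclass(frozen=True)
class Triple:
    """One RTE result: (head entity, relation, tail entity)."""
    head: str
    relation: str
    tail: str


def build_graph(entities, triples):
    """Index triples as head -> [(relation, tail), ...].

    Only triples whose head and tail were both recognized as entities
    are kept; this filtering rule is an assumption of the sketch.
    """
    known = {e.text for e in entities}
    graph = defaultdict(list)
    for t in triples:
        if t.head in known and t.tail in known:
            graph[t.head].append((t.relation, t.tail))
    return dict(graph)


# Hand-written stand-ins for model output on the sentence
# "Marie Curie was born in Warsaw."
entities = [
    Entity("Marie Curie", "PERSON"),
    Entity("Warsaw", "LOCATION"),
]
triples = [
    Triple("Marie Curie", "born_in", "Warsaw"),
    # The tail below is not among the recognized entities, so it is dropped.
    Triple("Marie Curie", "worked_at", "Sorbonne"),
]

graph = build_graph(entities, triples)
print(graph)
```

In this sketch the entity list acts as a simple consistency check on the extracted triples; a real pipeline would likely also need entity normalization and deduplication.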

TableGPT2: A Large Multimodal Model with Tabular Data Integration

SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding

AcademicGPT: Empowering Academic Research

DB-GPT: Empowering Database Interactions with Private Large Language Models

Qwen2.5 Technical Report

ClinicalGPT: Large Language Models Finetuned with Diverse Medical Data and Comprehensive Evaluation

GraphGPT: Graph Instruction Tuning for Large Language Models

Optimizing Multi-Task Learning for Enhanced Performance in Large Language Models

RestGPT: Connecting Large Language Models with Real-World RESTful APIs

DoctorGPT: A Large Language Model with Chinese Medical Question-Answering Capabilities

LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model

Large Language Models are Complex Table Parsers

Examining User-Friendly and Open-Sourced Large GPT Models: A Survey on Language, Multimodal, and Scientific GPT Models

HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs

Large language models in biomedical natural language processing: benchmarks, baselines, and recommendations

DB-GPT: Large Language Model Meets Database

Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-guided Retrieval Augmented Generation

Qwen Technical Report

KnowledGPT: Enhancing Large Language Models with Retrieval and Storage Access on Knowledge Bases

TableGPT: Towards Unifying Tables, Nature Language and Commands into One GPT