SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding

Tianyu Yu, Chengyue Jiang, Chao Lou, Shen Huang, Xiaobin Wang, Wei Liu, Jiong Cai, Yangning Li, Yinghui Li, Kewei Tu, Hai-Tao Zheng, Ningyu Zhang, Pengjun Xie, Fei Huang, Yong Jiang
2023-08-21
Abstract: Large language models (LLMs) have shown impressive ability for open-domain NLP tasks. However, LLMs are sometimes too footloose for natural language understanding (NLU) tasks which always have restricted output and input format. Their performances on NLU tasks are highly related to prompts or demonstrations and are shown to be poor at performing several representative NLU tasks, such as event extraction and entity typing. To this end, we present SeqGPT, a bilingual (i.e., English and Chinese) open-source autoregressive model specially enhanced for open-domain natural language understanding. We express all NLU tasks with two atomic tasks, which define fixed instructions to restrict the input and output format but still "open" for arbitrarily varied label sets. The model is first instruction-tuned with extremely fine-grained labeled data synthesized by ChatGPT and then further fine-tuned by 233 different atomic tasks from 152 datasets across various domains. The experimental results show that SeqGPT has decent classification and extraction ability, and is capable of performing language understanding tasks on unseen domains. We also conduct empirical studies on the scaling of data and model size as well as on the transfer across tasks. Our model is accessible at https://github.com/Alibaba-NLP/SeqGPT.
Computation and Language
What problem does this paper attempt to address?
This paper addresses the limitations of large language models (LLMs) on natural language understanding (NLU) tasks. Although LLMs show impressive ability on open-domain NLP tasks, their output is often too unconstrained for NLU tasks, which require restricted input and output formats. Their performance depends heavily on prompts or demonstrations, and they do poorly on several representative NLU tasks such as event extraction and entity typing. To this end, the authors present SeqGPT, a bilingual (English and Chinese) open-source autoregressive model specially enhanced for open-domain NLU. SeqGPT expresses all NLU tasks as two atomic tasks (extraction and classification), each with a fixed instruction that restricts the input and output format while leaving the label set open to arbitrary variation. The model is first instruction-tuned on extremely fine-grained labeled data synthesized by ChatGPT and then further fine-tuned on 233 different atomic tasks from 152 datasets across various domains. Experimental results show that SeqGPT has decent classification and extraction ability and can perform language understanding tasks on unseen domains.
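As a rough illustration of the "two atomic tasks with fixed instructions and an open label set" idea, the sketch below builds a prompt for either task from a caller-supplied label set. The template layout, the instruction strings, and the names `AtomicTask` and `build_prompt` are all hypothetical and chosen for illustration only; they are not taken from the SeqGPT paper or its repository.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical instruction words. The source only says that each atomic task
# has a fixed instruction; it does not give the actual wording.
INSTRUCTIONS = {
    "classification": "classify",
    "extraction": "extract",
}


@dataclass
class AtomicTask:
    kind: str          # "classification" or "extraction"
    text: str          # the input sequence
    labels: List[str]  # arbitrary label set supplied at query time


def build_prompt(task: AtomicTask) -> str:
    """Render a fixed-format prompt; only the text and label set vary."""
    if task.kind not in INSTRUCTIONS:
        raise ValueError(f"unknown atomic task: {task.kind!r}")
    if not task.labels:
        raise ValueError("label set must not be empty")
    label_list = ", ".join(task.labels)
    # Assumed layout: input, fixed instruction, open label set, output slot.
    return (
        f"Input: {task.text}\n"
        f"Task: {INSTRUCTIONS[task.kind]}\n"
        f"Labels: {label_list}\n"
        f"Output:"
    )


if __name__ == "__main__":
    sentence = "Apple released a new phone in California."
    print(build_prompt(AtomicTask("classification", sentence, ["technology", "sports"])))
    print(build_prompt(AtomicTask("extraction", sentence, ["organization", "location"])))
```

The point of the design is that the instruction part never changes, so the output format stays predictable, while the label list can be swapped freely for a new domain without retraining.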