Abstract: Large language models (LLMs) have demonstrated remarkable prowess in language understanding and generation. Advancing from foundation LLMs to instruction-following LLMs, instruction tuning plays a vital role in aligning LLMs to human preferences. However, existing LLMs usually focus on English, leading to inferior performance in non-English languages. To improve performance for non-English languages, it is necessary to collect language-specific training data for foundation LLMs and construct language-specific instructions for instruction tuning, both of which are heavy loads. To minimize human workload, we propose to transfer the capabilities of language generation and instruction following from English to other languages through an interactive translation task. We have developed BayLing, an instruction-following LLM, by utilizing LLaMA as the foundation LLM and automatically constructing interactive translation instructions for instruction tuning. Extensive assessments demonstrate that BayLing achieves comparable performance to GPT-3.5-turbo, despite utilizing a considerably smaller parameter size of only 13 billion. Experimental results on translation tasks show that BayLing achieves 95% of single-turn translation capability compared to GPT-4 with automatic evaluation and 96% of interactive translation capability compared to GPT-3.5-turbo with human evaluation. To estimate the performance on general tasks, we created a multi-turn instruction test set called BayLing-80. The experimental results on BayLing-80 indicate that BayLing achieves 89% of performance compared to GPT-3.5-turbo. BayLing also demonstrates outstanding performance on knowledge assessment of Chinese GaoKao and English SAT, second only to GPT-3.5-turbo among a multitude of instruction-following LLMs. Demo, homepage, code and models of BayLing are available.

Native Language Identification with Large Language Models

Leveraging Open-Source Large Language Models for Native Language Identification

Applying Large Language Models for Automated Essay Scoring for Non-Native Japanese

Scaling Native Language Identification with Transformer Adapters

How do Large Language Models Handle Multilingualism?

Large Language Models as Neurolinguistic Subjects: Identifying Internal Representations for Form and Meaning

Large Language Models Meet NLP: A Survey

GPT-NER: Named Entity Recognition via Large Language Models

Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis

Do Large Language Models Speak All Languages Equally? A Comparative Study in Low-Resource Settings

Fumbling in Babel: An Investigation into ChatGPT's Language Identification Ability

Generalizable clinical note section identification with large language models

Spoken Language Intelligence of Large Language Models for Language Learning

Can Large Language Models Identify Authorship?

Don't Trust ChatGPT when Your Question is not in English: A Study of Multilingual Abilities and Types of LLMs

Rethinking STS and NLI in Large Language Models

BayLing: Bridging Cross-lingual Alignment and Instruction Following through Interactive Translation for Large Language Models

A Sentence is Worth a Thousand Pictures: Can Large Language Models Understand Hum4n L4ngu4ge and the W0rld behind W0rds?

An Empirical Study on Information Extraction using Large Language Models