How do Large Language Models Handle Multilingualism?

Yiran Zhao, Wenxuan Zhang, Guizhen Chen, Kenji Kawaguchi, Lidong Bing
2024-05-24
Abstract: Large language models (LLMs) have demonstrated impressive capabilities across diverse languages. This study explores how LLMs handle multilingualism. Based on observed language ratio shifts among layers and the relationships between network structures and certain capabilities, we hypothesize the LLM's multilingual workflow ($\texttt{MWork}$): LLMs initially understand the query, converting multilingual inputs into English for task-solving. In the intermediate layers, they employ English for thinking and incorporate multilingual knowledge with self-attention and feed-forward structures, respectively. In the final layers, LLMs generate responses aligned with the original language of the query. To verify $\texttt{MWork}$, we introduce Parallel Language-specific Neuron Detection ($\texttt{PLND}$) to identify activated neurons for inputs in different languages without any labeled data. Using $\texttt{PLND}$, we validate $\texttt{MWork}$ through extensive experiments involving the deactivation of language-specific neurons across various layers and structures. Moreover, $\texttt{MWork}$ allows fine-tuning of language-specific neurons with a small dataset, enhancing multilingual abilities in a specific language without compromising others. This approach results in an average improvement of $3.6\%$ for high-resource languages and $2.3\%$ for low-resource languages across all tasks with just $400$ documents.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper asks how large language models (LLMs) process multilingual input. Specifically, it examines how LLMs understand a query, solve the task, and generate a response, and how these steps move between languages. The authors aim to reveal the internal workings of LLMs during multilingual processing and to propose an effective method for enhancing capabilities in a specific language without affecting performance in other languages.

The paper proposes a hypothesized multilingual workflow (MWork), divided into three stages:

1. **Understanding**: LLMs first understand the original non-English query and convert it into English.
2. **Task Solving**: In the intermediate layers, LLMs think in English using self-attention structures and incorporate multilingual knowledge to obtain factual content using feed-forward structures.
3. **Generation**: In the final layers, LLMs generate responses in the original language of the query.

To verify MWork, the authors introduce Parallel Language-specific Neuron Detection (PLND), which identifies the neurons activated by inputs in different languages without any labeled data. Using PLND, they show that selectively deactivating language-specific neurons across various layers and structures significantly reduces the model's performance on the corresponding tasks, which indicates that these neurons are crucial for handling tasks in that language.

In addition, the paper explores enhancing the multilingual capabilities of LLMs by fine-tuning language-specific neurons. This method requires only a small dataset (400 documents) and improves performance across all tasks by an average of 3.6% for high-resource languages and 2.3% for low-resource languages. The enhancement does not compromise performance in other languages, so it achieves targeted improvements for a specific language.
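
The detection, deactivation, and targeted fine-tuning ideas described above can be illustrated with a small Python sketch. This is not the authors' PLND implementation: it assumes that a neuron counts as "activated" when its absolute activation exceeds a fixed threshold on every input in a language, and that neurons active in all languages are treated as shared rather than language-specific. The function names, the threshold, and the toy activation values are all hypothetical.

```python
from typing import Dict, List, Set

# One row per input document, one value per neuron (toy stand-in for real activations).
Activations = List[List[float]]


def consistently_active(acts: Activations, threshold: float) -> Set[int]:
    """Indices of neurons whose |activation| exceeds `threshold` on every input."""
    n_neurons = len(acts[0])
    return {
        j for j in range(n_neurons)
        if all(abs(row[j]) > threshold for row in acts)
    }


def language_specific(
    acts_by_lang: Dict[str, Activations], threshold: float
) -> Dict[str, Set[int]]:
    """Assumed criterion: active for a language, minus neurons active for all languages."""
    active = {
        lang: consistently_active(acts, threshold)
        for lang, acts in acts_by_lang.items()
    }
    shared = set.intersection(*active.values())
    return {lang: neurons - shared for lang, neurons in active.items()}


def deactivate(row: List[float], neurons: Set[int]) -> List[float]:
    """Zero out the selected neurons, leaving all others unchanged."""
    return [0.0 if j in neurons else v for j, v in enumerate(row)]


def masked_update(
    weights: List[float], grads: List[float], neurons: Set[int], lr: float
) -> List[float]:
    """One gradient step applied only to the selected neurons (others stay frozen)."""
    return [
        w - lr * g if j in neurons else w
        for j, (w, g) in enumerate(zip(weights, grads))
    ]


# Hypothetical activations: neuron 0 fires for both languages,
# neuron 1 only for "en", neuron 2 only for "fr".
acts_by_lang = {
    "en": [[0.9, 0.8, 0.0, 0.1], [0.7, 0.9, 0.1, 0.0]],
    "fr": [[0.8, 0.1, 0.9, 0.0], [0.9, 0.0, 0.7, 0.2]],
}

specific = language_specific(acts_by_lang, threshold=0.5)
ablated = deactivate(acts_by_lang["fr"][0], specific["fr"])
updated = masked_update([1.0, 1.0, 1.0, 1.0], [0.5, 0.5, 0.5, 0.5], specific["fr"], lr=0.5)
```

In a real model the activations would be collected from self-attention and feed-forward layers, and the update would mask gradients on the corresponding weight rows. The sketch only shows the set logic: find per-language neurons, zero them to test their importance, and restrict updates to them so that other languages' neurons are untouched.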