Open Conversational LLMs do not know most Spanish words

Javier Conde, Miguel González, Nina Melero, Raquel Ferrando, Gonzalo Martínez, Elena Merino-Gómez, José Alberto Hernández, Pedro Reviriego
DOI: https://doi.org/10.26342/2024-73-7
2024-09-24
Abstract: The growing interest in Large Language Models (LLMs) and in particular in conversational models with which users can interact has led to the development of a large number of open-source chat LLMs. These models are evaluated on a wide range of benchmarks to assess their capabilities in answering questions or solving problems on almost any possible topic or to test their ability to reason or interpret texts. Instead, the evaluation of the knowledge that these models have of the languages has received much less attention. For example, the words that they can recognize and use in different languages. In this paper, we evaluate the knowledge that open-source chat LLMs have of Spanish words by testing a sample of words in a reference dictionary. The results show that open-source chat LLMs produce incorrect meanings for an important fraction of the words and are not able to use most of the words correctly to write sentences with context. These results show how Spanish is left behind in the open-source LLM race and highlight the need to push for linguistic fairness in conversational LLMs ensuring that they provide similar performance across languages.
Computation and Language
What problem does this paper attempt to address?
The paper evaluates the lexical knowledge that open conversational large language models (LLMs) have of Spanish. The authors selected a set of open conversational LLMs from different companies and organizations and assessed their vocabulary knowledge by testing whether they recognize and correctly use a sample of Spanish words from a reference dictionary.

#### Main Objectives:

1. **Evaluate how well open conversational LLMs recognize and use Spanish words**: The study examines whether these models can give the correct meaning of Spanish words and use them to construct meaningful sentences.
2. **Provide an overview of the current state of Spanish vocabulary knowledge in open conversational LLMs**: It summarizes how much of the Spanish lexicon these models know.
3. **Analyze the impact of model size on vocabulary knowledge**: The study explores how vocabulary knowledge differs among models of different sizes.
4. **Compare the vocabulary knowledge of multilingual models and models focused on English and Chinese**: It contrasts models designed to support multiple languages with those optimized for one or two languages.
5. **Analyze the effect of adapting pre-trained models to Spanish**: The study investigates how much vocabulary knowledge improves after a pre-trained model is adapted to enhance its Spanish performance.

### Research Background:

Most current research focuses on the performance of LLMs on various tasks and topics, and little attention is given to their lexical knowledge in different languages. The paper shows through empirical testing that existing open conversational LLMs have significant deficiencies in handling Spanish vocabulary, which indicates a need to promote linguistic fairness so that these models perform similarly across languages.
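
The evaluation outlined above (sample words from a reference dictionary, ask a model for each word's meaning and for a sentence using it, then measure the fraction handled correctly) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the Spanish prompts, and the `ask_model`, `judge_meaning`, and `judge_usage` callables are all hypothetical placeholders, and how correctness is actually judged is left to the caller.

```python
import random
from typing import Callable, Dict, List


def sample_words(dictionary: List[str], k: int, seed: int = 0) -> List[str]:
    """Draw a reproducible random sample of k words from a word list."""
    rng = random.Random(seed)
    return rng.sample(dictionary, k)


def evaluate_lexical_knowledge(
    words: List[str],
    ask_model: Callable[[str], str],            # hypothetical: prompt -> model answer
    judge_meaning: Callable[[str, str], bool],  # hypothetical: (word, answer) -> correct?
    judge_usage: Callable[[str, str], bool],    # hypothetical: (word, sentence) -> correct?
) -> Dict[str, float]:
    """Return the fraction of words defined correctly and used correctly."""
    if not words:
        raise ValueError("words must not be empty")

    meaning_ok = 0
    usage_ok = 0
    for word in words:
        # Illustrative prompts only; the paper's exact prompts are not given here.
        definition = ask_model(f'¿Cuál es el significado de la palabra "{word}"?')
        sentence = ask_model(f'Escribe una frase usando la palabra "{word}".')
        meaning_ok += bool(judge_meaning(word, definition))
        usage_ok += bool(judge_usage(word, sentence))

    n = len(words)
    return {"meaning_accuracy": meaning_ok / n, "usage_accuracy": usage_ok / n}
```

Keeping the model and the judges as injected callables means the same loop can be reused across models of different sizes or language focus, which is the kind of comparison the objectives above describe.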