Medical mT5: An Open-Source Multilingual Text-to-Text LLM for The Medical Domain

Iker García-Ferrero, Rodrigo Agerri, Aitziber Atutxa Salazar, Elena Cabrio, Iker de la Iglesia, Alberto Lavelli, Bernardo Magnini, Benjamin Molinet, Johana Ramirez-Romero, German Rigau, Jose Maria Villa-Gonzalez, Serena Villata, Andrea Zaninello
2024-04-11
Abstract: Research on language technology for the development of medical applications is currently a hot topic in Natural Language Understanding and Generation. Thus, a number of large language models (LLMs) have recently been adapted to the medical domain, so that they can be used as a tool for mediating in human-AI interaction. While these LLMs display competitive performance on automated medical text benchmarks, they have been pre-trained and evaluated with a focus on a single language (mostly English). This is particularly true of text-to-text models, which typically require large amounts of domain-specific pre-training data, often not easily accessible for many languages. In this paper, we address these shortcomings by compiling, to the best of our knowledge, the largest multilingual corpus for the medical domain in four languages, namely English, French, Italian and Spanish. This new corpus has been used to train Medical mT5, the first open-source text-to-text multilingual model for the medical domain. Additionally, we present two new evaluation benchmarks for all four languages with the aim of facilitating multilingual research in this domain. A comprehensive evaluation shows that Medical mT5 outperforms both encoders and similarly sized text-to-text models for the Spanish, French, and Italian benchmarks, while being competitive with current state-of-the-art LLMs in English.
Computation and Language, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
The paper addresses two gaps:

1. **Medical LLMs and benchmarks focus on a single language**: Existing medical LLMs perform competitively on automated medical text benchmarks, but they have been pre-trained and evaluated mostly on English. High-quality benchmarks are scarce for other languages, including Spanish, French, and Italian.
2. **No open-source multilingual text-to-text model for the medical domain**: Encoder models for the medical domain exist (such as SciBERT and BioBERT), but multilingual text-to-text models are still missing. Such models typically require large amounts of domain-specific pre-training data, which is hard to obtain for many languages.

To address these gaps, the authors compiled what is, to the best of their knowledge, the largest multilingual medical corpus, covering English, French, Italian, and Spanish, and used it to train Medical mT5, the first open-source multilingual text-to-text model for the medical domain. They also created two new benchmarks in all four languages, one for sequence labeling (argument component detection) and one for generative question answering, to facilitate multilingual research in this field. The evaluation shows that Medical mT5 outperforms both encoders and similarly sized text-to-text models on the Spanish, French, and Italian benchmarks and is competitive with current state-of-the-art LLMs in English.
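Because a text-to-text model reads and writes plain strings, a sequence labeling task such as argument component detection has to be expressed as text before the model can be trained or evaluated on it. The sketch below shows one possible way to do that: it converts a BIO-tagged sentence into a (source, target) string pair. The `<Label> ... </Label>` markup, the function name `bio_to_text_pair`, the label names `Claim` and `Premise`, and the example sentence are all illustrative assumptions; the summary above does not specify the format the authors actually use.

```python
from typing import List, Optional, Tuple


def bio_to_text_pair(tokens: List[str], labels: List[str]) -> Tuple[str, str]:
    """Turn a BIO-tagged sentence into a (source, target) string pair.

    The target repeats the sentence with each labeled span wrapped in
    <Label> ... </Label> markers. This markup is an illustrative
    assumption, not necessarily the format used in the paper.
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")

    out: List[str] = []
    open_label: Optional[str] = None  # type of the span currently open, if any

    for token, label in zip(tokens, labels):
        # "B-Claim" -> prefix "B", span type "Claim"; "O" has no span type.
        prefix, _, span_type = label.partition("-")
        continues = prefix == "I" and span_type == open_label

        # Close the open span when this token does not continue it.
        if open_label is not None and not continues:
            out.append(f"</{open_label}>")
            open_label = None

        # Open a new span on B-, or on a stray I- when no span is open.
        if prefix in ("B", "I") and open_label is None:
            out.append(f"<{span_type}>")
            open_label = span_type

        out.append(token)

    # Close a span that runs to the end of the sentence.
    if open_label is not None:
        out.append(f"</{open_label}>")

    return " ".join(tokens), " ".join(out)


# Made-up example sentence with hypothetical argument component labels.
tokens = ["Aspirin", "reduced", "pain", ",", "so", "it", "is", "effective", "."]
labels = ["B-Premise", "I-Premise", "I-Premise", "O", "O",
          "B-Claim", "I-Claim", "I-Claim", "O"]

source, target = bio_to_text_pair(tokens, labels)
print(source)
print(target)
```

With a pair like this, the source string would serve as the model input and the target string as the expected output, so the same encoder-decoder can in principle handle both labeling and generative question answering without a task-specific classification head.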