Development and bilingual evaluation of Japanese medical large language model within reasonably low computational resources

Issey Sukeda
2024-09-20
Abstract: The recent success of large language models (LLMs) and the scaling law has led to widespread adoption of larger models. Particularly in the healthcare industry, there is an increasing demand for locally operated LLMs due to security concerns. However, the majority of high-quality open-source LLMs have a size of 70B parameters, imposing significant financial burdens on users for GPU preparation and operation. To overcome these issues, we present a medical adaptation based on recent 7B models, which enables operation with low computational resources. We compare performance on medical question-answering benchmarks in two languages (Japanese and English), demonstrating that its scores reach parity with or surpass those of currently existing medical LLMs that are ten times larger. We find that fine-tuning an English-centric base model on a Japanese medical dataset improves the score in both languages, supporting the effect of cross-lingual knowledge transfer. We hope that this study will alleviate financial challenges, serving as a stepping stone for clinical institutions to practically utilize LLMs locally. Our evaluation code is available at https://github.com/stardust-coder/japanese-lm-med-harness.
Computation and Language
What problem does this paper attempt to address?
The paper aims to develop a large language model (LLM) for the Japanese medical domain that runs efficiently and performs well under limited computing resources. It focuses on four aspects:

1. **Reducing computing costs**: Most high-quality open-source LLMs have 70 billion parameters, which places a heavy financial burden on users, especially for GPU preparation and operation. The authors therefore build on models with about 7 billion parameters to reduce the demand for computing resources.
2. **Improving the security of local deployment**: In the medical industry, security is particularly important because patients' personal information is involved. Large language models are often accessible only through API services, which limits their practical use in clinical environments. The authors aim to improve data security and model customizability by developing smaller models that can run locally.
3. **Achieving cross-lingual knowledge transfer**: The authors test the effect of cross-lingual knowledge transfer by fine-tuning an English-centric base model on a Japanese medical dataset, that is, whether performance on Japanese medical tasks improves without sacrificing English performance.
4. **Evaluating model performance**: The paper validates the proposed method by comparing model performance on medical question-answering benchmarks in two languages (Japanese and English). The authors aim to show that, with appropriate fine-tuning, small models can match or exceed existing medical LLMs that are ten times larger.

In summary, the paper explores how to develop a compact LLM for the Japanese medical domain that maintains performance and improves security under limited computing resources.
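
To make the cost argument concrete, the following is a minimal back-of-the-envelope sketch of the memory needed just to hold model weights at 7B versus 70B parameters. It is not taken from the paper: the function name `weight_memory_gib`, the precision table, and the bytes-per-parameter figures are illustrative assumptions, and the estimate ignores activations, the KV cache, and optimizer state, all of which add to real requirements.

```python
# Rough estimate of weight-only memory for an LLM at several precisions.
# Assumptions (not from the paper): weights dominate memory at inference time,
# and each parameter takes 2 bytes (fp16), 1 byte (int8), or 0.5 bytes (int4).

BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Return the approximate size of the model weights in GiB."""
    return num_params * bytes_per_param / 2**30


if __name__ == "__main__":
    for label, num_params in [("7B", 7e9), ("70B", 70e9)]:
        for precision, nbytes in BYTES_PER_PARAM.items():
            gib = weight_memory_gib(num_params, nbytes)
            print(f"{label} model, {precision}: about {gib:.1f} GiB of weights")
```

Under these assumptions, fp16 weights for a 7B model take roughly 13 GiB, while a 70B model needs roughly 130 GiB, which is the tenfold gap the summary refers to when it contrasts a single-GPU deployment with a multi-GPU one.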