A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine

Hanguang Xiao, Feizhong Zhou, Xingyue Liu, Tianqi Liu, Zhipeng Li, Xin Liu, Xiaoxuan Huang
2024-05-14
Abstract: Since the release of ChatGPT and GPT-4, large language models (LLMs) and multimodal large language models (MLLMs) have attracted significant attention for their powerful, general capabilities in understanding, reasoning, and generation, offering new paradigms for integrating artificial intelligence with medicine. This survey provides a comprehensive overview of the development background and principles of LLMs and MLLMs and explores their application scenarios, challenges, and future directions in medicine. Specifically, it begins with the paradigm shift, tracing the evolution from traditional models to LLMs and MLLMs and summarizing their model structures to provide detailed foundational knowledge. It then details the entire process of constructing, evaluating, and using LLMs and MLLMs. Next, to emphasize the significant value of LLMs and MLLMs in healthcare, we survey and summarize six promising applications in healthcare. Finally, the survey discusses the challenges faced by medical LLMs and MLLMs and proposes a feasible approach and direction for the subsequent integration of artificial intelligence with medicine. The survey thus aims to provide researchers with a valuable and comprehensive reference guide covering the background, principles, and clinical applications of LLMs and MLLMs.
Computation and Language
What problem does this paper attempt to address?
This paper is a comprehensive survey of large language models (LLMs) and multimodal large language models (MLLMs) in medicine. Since the release of ChatGPT and GPT-4, these models have opened new avenues for integrating artificial intelligence with medicine, owing to their strong capabilities in understanding, reasoning, and generation. The paper outlines the background and principles of LLMs and MLLMs and explores their application scenarios, challenges, and future directions in healthcare. It first introduces the paradigm shift from traditional models to LLMs and MLLMs and summarizes the foundational knowledge of model structures. In doing so, it describes four stages of development in the NLP field: supervised learning, unsupervised pre-training and fine-tuning, unsupervised pre-training and prompting, and text-to-multimodal. It then walks through the entire process of building, evaluating, and using these models. To emphasize their importance in healthcare, the paper surveys and summarizes six promising application areas. Next, it discusses the challenges that LLMs and MLLMs face in the medical field and proposes feasible approaches and directions for the future integration of artificial intelligence with medicine. It also emphasizes the impact of high-quality data on the performance of LLMs and MLLMs and predicts that data engineering will be a key focus of future research. In summary, the paper addresses how LLMs and MLLMs can be understood and leveraged in medicine, and how the related challenges can be overcome to promote the application of artificial intelligence in healthcare. Its aim is to give researchers a comprehensive reference guide to the background, principles, and clinical applications of these models.