AI for Biomedicine in the Era of Large Language Models

Zhenyu Bi, Sajib Acharjee Dip, Daniel Hajialigol, Sindhura Kommu, Hanwen Liu, Meng Lu, Xuan Wang

2024-03-23

Abstract: The capabilities of AI for biomedicine span a wide spectrum, from the atomic level, where it solves partial differential equations for quantum systems, to the molecular level, predicting chemical or protein structures, and further extending to societal predictions like infectious disease outbreaks. Recent advancements in large language models, exemplified by models like ChatGPT, have showcased significant prowess in natural language tasks, such as translating languages, constructing chatbots, and answering questions. When we consider biomedical data, we observe a resemblance to natural language in terms of sequences: biomedical literature and health records presented as text, biological sequences or sequencing data arranged in sequences, or sensor data like brain signals as time series. The question arises: Can we harness the potential of recent large language models to drive biomedical knowledge discoveries? In this survey, we will explore the application of large language models to three crucial categories of biomedical data: 1) textual data, 2) biological sequences, and 3) brain signals. Furthermore, we will delve into large language model challenges in biomedical research, including ensuring trustworthiness, achieving personalization, and adapting to multi-modal data representation.

Computation and Language

What problem does this paper attempt to address?

This paper surveys how large language models (LLMs) such as ChatGPT can be applied to biomedicine to promote knowledge discovery. The authors point out that biomedical data resembles natural language in its sequential structure: textual data (such as medical literature and health records), biological sequences (such as DNA, RNA, and proteins), and brain signals (time series data). The paper explores how LLMs can process these three types of biomedical data and examines the challenges their application raises, namely trustworthiness, personalization, and adaptation to multimodal data.

For biomedical text, the paper gives a detailed introduction to pre-trained models such as SciBERT, ClinicalBERT, BioBERT, BioMegatron, SciFive, PubMedBERT, BioLinkBERT, Galactica, BioGPT, DoT5, GatorTronGPT, and Med-PaLM 2. It discusses their potential in clinical applications (such as clinical treatment planning, report generation, and multi-agent collaboration) and in research applications (such as information extraction and question-answering systems).

For biological sequences, the paper covers the analysis of DNA, RNA, protein, and multi-omics sequencing data. It highlights models such as Enformer, Nucleotide Transformer, GenSLMs, DNABERT, GENA-LM, and HyenaDNA and their role in gene expression prediction, virus evolution analysis, transcription factor binding site identification, and other areas.

In summary, the paper addresses how large language models can be used effectively to drive knowledge discovery in biomedicine. By understanding and applying these models, it aims to handle the complexity and diversity of biomedical data in order to improve diagnostic accuracy, disease prediction, and personalized healthcare.

A Survey for Large Language Models in Biomedicine

Large language models in bioinformatics: applications and perspectives

A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine

An Evaluation of Large Language Models in Bioinformatics Research

Large AI Models in Health Informatics: Applications, Challenges, and the Future

Large language models in health care: Development, applications, and challenges

Artificial intelligence: revolutionizing cardiology with large language models

Perspectives on the application of large language models in healthcare

Demystifying Large Language Models for Medicine: A Primer

Healthcare: A Growing Role for Large Language Models and Generative AI

Opportunities and Challenges for ChatGPT and Large Language Models in Biomedicine and Health

Large language models reshaping molecular biology and drug development

A Survey on Medical Large Language Models: Technology, Application, Trustworthiness, and Future Directions

The future landscape of large language models in medicine

Based on Medicine, The Now and Future of Large Language Models

Large language models for biomedicine: foundations, opportunities, challenges, and best practices

Large Language Models in Medicine: The Potentials and Pitfalls

Large Language Models for Scientific Synthesis, Inference and Explanation

The application of large language models in medicine: A scoping review