Evaluating large language models for use in healthcare: A framework for translational value assessment

Sandeep Reddy
DOI: https://doi.org/10.1016/j.imu.2023.101304
2023-07-05
Informatics in Medicine Unlocked
Abstract:The recent focus on Large Language Models (LLMs) has yielded unprecedented discussion of their potential use in various domains, including healthcare. While showing considerable potential in performing human-capable tasks, LLMs have also demonstrated significant drawbacks, including generating misinformation, falsifying data, and contributing to plagiarism. These aspects are generally concerning but can be more severe in the context of healthcare. As LLMs are explored for utility in healthcare, including generating discharge summaries, interpreting medical records and providing medical advice, it is necessary to ensure safeguards around their use in healthcare. Notably, there must be an evaluation process that assesses LLMs for their natural language processing performance and their translational value. Complementing this assessment, a governance layer can ensure accountability and public confidence in such models. Such an evaluation framework is discussed and presented in this paper.
What problem does this paper attempt to address?