Appropriateness of Answers to Common Preanesthesia Patient Questions Composed by the Large Language Model GPT-4 Compared to Human Authors

Scott Segal, Amit K. Saha, Ashish K. Khanna
DOI: https://doi.org/10.1097/aln.0000000000004824
IF: 8.986
2024-01-17
Anesthesiology
Abstract: Many surgical patients will not interact with anesthesiologists until minutes before surgery, and the internet has become a common source of medical information. The use of large language models such as GPT-4, which are "generative artificial intelligence" tools capable of creating natural, human-sounding prose in response to a plain-language query, and their incorporation into search engines, promise to make it easier for patients to directly ask questions related to preanesthetic preparation. The accuracy of large language models in answering medical questions has generally been impressive (1–3) but has not been evaluated for preanesthetic queries. We evaluated the ability of the widely accessible model GPT-4 to provide reasonable responses to common preanesthetic patient questions, compared to online published resources. Our hypothesis was that GPT-4's responses would be at least as reasonable as those from published resources.