Improving readability and comprehension levels of otolaryngology patient education materials using ChatGPT

Allison D Oliva, Luke J Pasick, Michael E Hoffer, David E Rosow
DOI: https://doi.org/10.1016/j.amjoto.2024.104502
2024-08-26
Abstract:
Objective: A publicly available large language model platform may help determine the current readability levels of otolaryngology patient education materials, as well as translate these materials to the recommended 6th-grade and 8th-grade reading levels.
Study design: Cross-sectional analysis.
Setting: Online, using the large language model ChatGPT.
Methods: The Patient Education pages of the American Laryngological Association (ALA) and American Academy of Otolaryngology-Head and Neck Surgery (AAO-HNS) websites were accessed. Materials were input into ChatGPT (OpenAI, San Francisco, CA; version 3.5) and Microsoft Word (Microsoft, Redmond, WA; version 16.74). Both programs calculated Flesch Reading Ease (FRE) scores, with higher scores indicating easier readability, and Flesch-Kincaid (FK) grade levels, which estimate the U.S. grade level required to understand a text. ChatGPT was prompted to "translate to a 5th-grade reading level" and to provide new scores. Scores before and after translation were compared for statistical differences, as were the gradings produced by ChatGPT and Word.
Results: Patient education materials were reviewed, and 37 ALA and 72 AAO-HNS topics were translated. Overall FRE scores and FK grades demonstrated significant improvements following translation of materials, as scored by ChatGPT (p < 0.001). Word also scored significant improvements in FRE and FK following translation by ChatGPT for AAO-HNS materials overall (p < 0.001), but not for individual topics or for subspecialty-specific categories. Compared with Word, ChatGPT significantly exaggerated the change in FRE scores and FK grades (p < 0.001).
Conclusion: Otolaryngology patient education materials were found to be written at higher reading levels than recommended. Artificial intelligence may prove to be a useful resource for simplifying content and making it more accessible to patients.
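
For readers unfamiliar with the two metrics named in the abstract, the sketch below applies the standard published Flesch Reading Ease and Flesch-Kincaid grade-level formulas to a piece of text. It is an illustration only, not the procedure used in the study: Word and ChatGPT rely on their own internal implementations, and the function names (`count_syllables`, `readability`) and the vowel-group syllable heuristic here are assumptions made for this example.

```python
import re


def count_syllables(word: str) -> int:
    """Estimate syllables with a rough heuristic (an assumption, not a dictionary lookup).

    Counts runs of consecutive vowels and drops one for a likely silent
    trailing 'e'. Every word is counted as having at least one syllable.
    """
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid grade level) for text."""
    # Naive sentence and word splitting; real tools handle abbreviations etc.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    # Standard Flesch formulas: higher FRE means easier text,
    # higher FK grade means harder text.
    fre = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    fk_grade = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return fre, fk_grade


if __name__ == "__main__":
    fre, fk = readability("The cat sat on the mat.")
    print(f"FRE = {fre:.1f}, FK grade = {fk:.1f}")
```

Because both formulas depend only on average sentence length and average syllables per word, tools that tokenize sentences or count syllables differently can report different scores for the same text, which is one plausible reason two programs might grade the same material differently.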