Comment on: Assessing the Quality and Readability of Online Patient Information: ENT UK Patient Information e-Leaflets vs Responses by a Generative Artificial Intelligence

Hinpetch Daungsupawong, Viroj Wiwanitkit
DOI: https://doi.org/10.1055/s-0044-1791697
2024-10-13
Facial Plastic Surgery
Abstract: Shamil E, Ko TK, Fan KS, et al. Assessing the quality and readability of online patient information: ENT UK patient information e-leaflets vs responses by a generative artificial intelligence. Facial Plast Surg 2024 (e-pub ahead of print). doi: 10.1055/a-2413-3675.

Dear Editor,

We would like to comment on the publication by Shamil et al. This study compared the quality and readability of AI-generated health information, specifically ChatGPT responses, with patient information leaflets created by professionals from ENT UK, using the Ensuring Quality Information for Patients (EQIP) tool for quality assessment and the Flesch–Kincaid Grade Level (FKGL) for readability assessment. The study gathered and examined material from both sources. The results showed that the ENT UK leaflets were of moderate quality, whereas the ChatGPT responses were consistent in quality but less readable. Raters with varying degrees of medical knowledge differed in their quality ratings, with nonspecialist doctors scoring the highest and medical students scoring the lowest. Overall, the ChatGPT responses showed comparable content quality, although their readability was worse than that of the expert leaflets.

This study has several limitations. First, the sample of five ENT UK leaflets may be insufficient to draw firm conclusions regarding the overall quality of ChatGPT-generated health information across a wide range of medical topics. A larger sample may yield more detailed insights into trends and consistency in information quality. Furthermore, while the EQIP and FKGL instruments are valuable for assessment, they may not capture every factor relevant to patient comprehension, especially across communities with varying levels of health literacy. This limits our understanding of how the findings apply to real-world patient scenarios. The rating approach may also introduce bias.
Raters with varying levels of medical knowledge may give contradictory evaluations of the leaflets, especially where differing levels of competence influence perception. This discrepancy is exacerbated by a lack of uniform rater training or anonymization, making it difficult to establish the extent of rater bias in the data. Finally, the mode of delivery of the health information, such as the format and accessibility of leaflets versus artificial intelligence (AI) generated responses, was not taken into account, which may have skewed the readability assessments.

Future studies should look at a broader range of topics and use larger sample sizes to reach more robust conclusions about digital health information versus professionally developed health information. Furthermore, using a more diverse range of raters, such as a mix of health experts, laypeople, and patients, may help clarify how health information is understood across communities. Examining the effects of various formats on digital health information, such as visual media or multimedia components, may provide more thorough insights for enhancing patient reading and comprehension.

This study provides a preliminary methodology for comparing AI-generated health information with professionally established sources. To constructively expand on the findings, future research could include interactive components in assessments, such as user engagement measurements (e.g., time spent on information and retention rates) and real-world patient comments on usability. Additionally, incorporating machine learning techniques to increase the accessibility and usefulness of AI-generated content could result in better patient education resources tailored to individual requirements. This combination of technology and patient education creates new prospects for improving health care communication in an increasingly digital society.

H.D. contributed 50% of the ideas, writing, analysis, and approval. V.W. contributed 50% of the ideas, supervision, and approval. The authors used artificial intelligence for language editing of the article.

Article published online: 11 October 2024. © 2024. Thieme. All rights reserved. Thieme Medical Publishers, Inc. 333 Seventh Avenue, 18th Floor, New York, NY 10001, USA
surgery