Evaluation of the quality and readability of ChatGPT responses to frequently asked questions about myopia in traditional Chinese language

Li-Chun Chang, Chi-Chin Sun, Ting-Han Chen, Der-Chong Tsai, Hui-Ling Lin, Li-Ling Liao
DOI: https://doi.org/10.1177/20552076241277021
2024-09-02
Abstract: Introduction: ChatGPT can serve as an adjunct informational tool for ophthalmologists and their patients. However, the reliability and readability of its responses to myopia-related queries in the Chinese language remain underexplored. Purpose: This study aimed to evaluate the ability of ChatGPT to address frequently asked questions (FAQs) about myopia posed by parents and caregivers. Method: Myopia-related FAQs were input three times into fresh ChatGPT sessions, and the responses were evaluated by 10 ophthalmologists using a Likert scale for appropriateness, usability, and clarity. The Chinese Readability Index Explorer (CRIE) was used to evaluate the readability of each response. Inter-rater reliability among the reviewers was examined using Cohen's kappa coefficient, and Spearman's rank correlation analysis and one-way analysis of variance were used to investigate the relationship between CRIE scores and each criterion. Results: Forty-five percent of ChatGPT's Chinese-language responses were appropriate and usable, and only 35% met all the set criteria. The CRIE scores for the 20 ChatGPT responses ranged from 7.29 to 12.09, indicating a readability level equivalent to middle-to-high school. Responses about treatment efficacy and side effects were deficient on all three criteria. Conclusions: The performance of ChatGPT in addressing pediatric myopia-related questions is currently suboptimal. As parents increasingly use digital resources to obtain health information, it has become crucial for eye care professionals to familiarize themselves with artificial intelligence-driven information on pediatric myopia.