(021) ChatGPT's Ability to Assess Quality and Readability of Online Medical Information

R Golan, SJ Ripps, R Raghuram, J Loloi, A Bernstein, ZM Connelly, NS Golan, R Ramasamy
DOI: https://doi.org/10.1093/jsxmed/qdae001.019
2024-02-07
The Journal of Sexual Medicine
Abstract:
Introduction: Health literacy plays a crucial role in enabling patients to understand and effectively use medical information. As technology rapidly advances, health literacy becomes even more important, particularly for comprehending complex medical information. Artificial Intelligence (AI) platforms have garnered significant attention for their ability to generate automated responses to a wide range of prompts. However, their capacity to assess the quality and readability of provided text remains uncertain. Given the growing prominence of AI web assistant tools, we hypothesized that integrating these tools into patients' web searches could enhance the retrieval of accurate medical information.
Objective: To evaluate the proficiency of Conversational Generative Pre-Trained Transformer (ChatGPT) in assessing the readability of online content regarding shock wave therapy for erectile dysfunction and, using the DISCERN tool, its quality.
Methods: Websites were identified through a Google search of "shock wave therapy for erectile dysfunction" with location filters disabled. Readability was analyzed using the Readable software (Readable.com, Horsham, United Kingdom). Quality was assessed independently by three reviewers using the DISCERN tool. The same plain text files were then entered into ChatGPT to determine whether it produced comparable metrics for readability and quality.
Results: ChatGPT's readability assessment differed significantly from that obtained from an established tool, Readable.com (p<0.05). This indicates a lack of alignment between ChatGPT's algorithm and that of established tools such as Readable.com. Similarly, the DISCERN score generated by ChatGPT differed significantly from the scores generated manually by human evaluators (p<0.05), suggesting that ChatGPT may not be capable of accurately identifying poor-quality information sources regarding shock wave therapy as a treatment for erectile dysfunction.
Conclusions: ChatGPT's evaluation of the quality and readability of online text regarding shock wave therapy for erectile dysfunction differs from that of human raters and trusted tools. ChatGPT's current capabilities were not sufficient for reliably assessing the quality and readability of textual content. Further research is needed to elucidate the role of AI in the objective evaluation of online medical content in other fields. Continued development in AI and incorporation of tools such as DISCERN into AI software may enhance the way patients navigate the web in search of high-quality medical content in the future.
Disclosure: No.
urology & nephrology
What problem does this paper attempt to address?
This paper aims to evaluate the performance of Conversational Generative Pre-Trained Transformer (ChatGPT) in assessing the quality and readability of online medical information. Specifically, the researchers are concerned with whether ChatGPT can effectively use the DISCERN tool to evaluate the quality of online content regarding extracorporeal shock wave therapy for erectile dysfunction, as well as its accuracy in readability assessment.

### Research Background

Health literacy plays a crucial role in patients' understanding and effective use of medical information. With the rapid development of technology, especially in understanding complex medical information, the importance of health literacy has become increasingly prominent. Artificial intelligence (AI) platforms have received extensive attention due to their ability to generate automated responses. However, the ability of these platforms to assess the quality and readability of the provided text is not clear. Given the increasing prominence of AI-based web-assistive tools, the researchers hypothesized that integrating these tools into patients' web searches could improve the ability to obtain accurate medical information.

### Research Objectives

- Evaluate the accuracy of ChatGPT in assessing the readability of online content.
- Use the DISCERN tool to evaluate the performance of ChatGPT in assessing the quality of online content regarding extracorporeal shock wave therapy for erectile dysfunction.

### Methods

- Generate a list of websites by searching "shock wave therapy for erectile dysfunction" on Google without enabling the location filter.
- Analyze readability using Readable software.
- Have three independent reviewers use the DISCERN tool to assess content quality.
- Input the same plain-text file into ChatGPT to determine whether it can produce results that match those of reliable tools and human assessments in readability and quality assessment.
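
The readability step in the Methods can be illustrated with a small sketch. The abstract does not say which metrics Readable reported or how it tokenizes text, so the code below is only an assumption-laden illustration: it computes the standard Flesch Reading Ease and Flesch-Kincaid Grade Level formulas, and the helper names (`count_syllables`, `readability`) and the vowel-group syllable heuristic are hypothetical choices, not the study's method.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables by counting runs of consecutive vowels (heuristic)."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing "e" as silent (e.g. "space"), except in "-le" endings.
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> dict:
    """Return Flesch Reading Ease and Flesch-Kincaid Grade Level for `text`."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return {
        # Higher Reading Ease means easier text.
        "flesch_reading_ease": 206.835
        - 1.015 * words_per_sentence
        - 84.6 * syllables_per_word,
        # Grade Level approximates the US school grade needed to read the text.
        "flesch_kincaid_grade": 0.39 * words_per_sentence
        + 11.8 * syllables_per_word
        - 15.59,
    }
```

Scores from a function like this would be computed for each website's plain text and then set against the values a chatbot returns for the same file. Dedicated tools use more careful sentence splitting and syllable counting, so small deviations from their output are expected.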

### Results

- The research results show that there are significant differences between ChatGPT's readability assessment and the reliable tool Readable.com (\( p < 0.05 \)), indicating a lack of consistency between ChatGPT's algorithm and established tools such as Readable.com.
- Similarly, there are also significant differences between the DISCERN scores generated by ChatGPT and those of human assessors (\( p < 0.05 \)), indicating that ChatGPT may not be able to accurately identify the quality of information sources regarding extracorporeal shock wave therapy as a treatment for erectile dysfunction.

### Conclusions

ChatGPT's assessment of the quality and readability of online texts regarding extracorporeal shock wave therapy for erectile dysfunction is different from the results of human assessors and reliable tools. ChatGPT's current capabilities are insufficient to reliably assess the quality and readability of text content. Future research is needed to clarify the role of AI in objectively assessing online medical content in other fields. Continued AI development and the integration of tools such as DISCERN into AI software may enhance patients' ability to find high-quality medical information in the future.
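
The abstract reports \( p < 0.05 \) for both comparisons but does not name the statistical test. As one possible way to compare two sets of scores for the same websites, the sketch below runs an exact paired sign-flip permutation test using only the standard library. The function name `paired_sign_flip_test` is hypothetical, the choice of test is an assumption, and the numbers in the demo are made-up placeholders, not study data.

```python
from itertools import product


def paired_sign_flip_test(a, b):
    """Exact two-sided sign-flip permutation test on paired differences.

    Null hypothesis: the paired differences a[i] - b[i] are symmetric about 0.
    Enumerates all 2**n sign patterns, so it is only practical for small n.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        total += 1
        # Small tolerance guards against floating-point rounding in ties.
        if stat >= observed - 1e-12:
            extreme += 1
    return extreme / total


if __name__ == "__main__":
    # Placeholder values for illustration only; NOT data from the study.
    tool_scores = [10.2, 11.5, 9.8, 12.1, 10.9, 11.3, 12.4, 10.1]
    chatbot_scores = [8.0, 9.1, 8.5, 9.0, 8.2, 9.4, 9.9, 8.3]
    print(paired_sign_flip_test(tool_scores, chatbot_scores))
```

A paired design fits here because each website is scored twice, once by the reference method and once by ChatGPT. A paired t-test or Wilcoxon signed-rank test would be equally plausible choices.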