Evaluating the effectiveness of artificial intelligence-based tools in detecting and understanding sleep health misinformation: Comparative analysis using Google Bard and OpenAI ChatGPT-4

Sergio Garbarino, Nicola Luigi Bragazzi
DOI: https://doi.org/10.1111/jsr.14210
2024-04-05
Abstract: This study evaluates the performance of two major artificial intelligence-based tools (ChatGPT-4 and Google Bard) in debunking sleep-related myths. Specifically, it assessed 20 sleep misconceptions using a 5-point Likert scale for falseness and public health significance, comparing the responses of the artificial intelligence tools with expert opinions. The results indicated that Google Bard correctly identified 19 out of 20 statements as false (95.0% accuracy), not differing significantly from ChatGPT-4 (85.0% accuracy, Fisher's exact test p = 0.615). Google Bard's ratings of the falseness of the sleep misconceptions averaged 4.25 ± 0.70, showing a moderately negative skewness (-0.42) and kurtosis (-0.83), and suggesting a distribution with fewer extreme values compared with ChatGPT-4. In assessing public health significance, Google Bard's mean score was 2.4 ± 0.80, with skewness and kurtosis of 0.36 and -0.07, respectively, indicating a more normal distribution compared with ChatGPT-4. The inter-rater agreement between Google Bard and sleep experts had an intra-class correlation coefficient of 0.58 for falseness and 0.69 for public health significance, showing moderate alignment (p = 0.065 and p = 0.014, respectively). Text-mining analysis revealed Google Bard's focus on practical advice, while ChatGPT-4 concentrated on theoretical aspects of sleep. The readability analysis suggested Google Bard's responses were more accessible, aligning with 8th-grade level material, versus ChatGPT-4's 12th-grade level complexity. The study demonstrates the potential of artificial intelligence in public health education, especially in sleep health, and underscores the importance of accurate, reliable artificial intelligence-generated information, calling for further collaboration between artificial intelligence developers, sleep health professionals and educators to enhance the effectiveness of sleep health promotion.
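The accuracy comparison in the abstract rests on a Fisher's exact test on a 2×2 table of correct versus incorrect classifications. The following is a minimal sketch of that kind of test, not the authors' analysis code. It assumes that 85.0% accuracy corresponds to 17 of 20 statements for ChatGPT-4 (the abstract gives only the percentage), and the function name `fisher_exact_two_sided` is hypothetical. Because the abstract does not state the software or the exact two-sided convention used, the value computed here need not match the reported p = 0.615 exactly; it illustrates only that a difference of this size across 20 items is far from conventional significance.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def weight(x):
        # Unnormalised hypergeometric weight of a table whose top-left cell is x.
        # Integer arithmetic keeps ties between equally likely tables exact.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)
    extreme = sum(
        weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed
    )
    return extreme / comb(n, col1)


# Rows: Google Bard, ChatGPT-4. Columns: correct, incorrect.
# The ChatGPT-4 counts (17, 3) are an assumption derived from "85.0% accuracy".
p_value = fisher_exact_two_sided(19, 1, 17, 3)
print(f"two-sided p = {p_value:.3f}")
```

With only 20 statements per tool, a gap of two correct classifications gives very little evidence of a real difference, which is why the abstract reports the two tools as not differing.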