Appraisal of AI‐generated dermatology literature reviews

Lauren Passby, Vidya Madhwapathi, Simon Tso, Aaron Wernham
DOI: https://doi.org/10.1111/jdv.20237
2024-07-14
Journal of the European Academy of Dermatology and Venereology
Abstract: Summary of the AI tools used (Tools), the topics assessed (Topics) and the total scores received from all three raters across all five categories for all AI tools (Scores).
Background: Artificial intelligence (AI) tools have the potential to revolutionize many facets of medicine and medical sciences research. Numerous AI tools have been developed, and their functionality is being continuously and iteratively improved.
Objectives: This study aimed to assess the performance of three AI tools (The Literature, Microsoft's Copilot and Google's Gemini) in performing literature reviews on a range of dermatology topics.
Methods: Each tool was asked to write a literature review on five topics. Each chosen topic has recently had a peer‐reviewed systematic review published. The outputs of each tool were graded on their evidence and analysis, conclusions and references on a 5‐point Likert scale by three dermatologists who are working in clinical practice, have completed the UK dermatology postgraduate training examination and are partaking in continued professional development.
Results: Across all five topics chosen, the literature reviews written by Gemini scored the highest. The mean score for Gemini for each review was 10.53, significantly higher than the mean scores achieved by The Literature (7.73) and Copilot (7.4) (p
dermatology