BT07 (P042) Developing a clinical audit methodology for monitoring dermatologist performance in artificial intelligence-enabled teledermatology pathways

Joshua Luck, Audrey Menezes, Dilraj Kalsi, Dan Mullarkey, Niall Wilson
DOI: https://doi.org/10.1093/bjd/ljae090.405
IF: 11.113
2024-06-28
British Journal of Dermatology
Abstract: Teledermatology can help support the timely diagnosis of skin cancer. NHS England recently published a roadmap to accelerate the roll-out of teledermatology services nationally, including an updated series of audit and quality control standards. However, there remains no consensus as to the ideal audit methodology, which contributes to a paucity of comparative teledermatology evidence. This study aims to describe a reproducible audit methodology for monitoring clinician performance in teledermatology pathways. We present a novel quantitative risk scoring system – the Teledermatology Audit Risk Scoring System (TARSS) – that draws upon our experience in monitoring the safety and effectiveness of an artificial intelligence (AI) as a medical device product in NHS skin cancer pathways. By adapting our clinical risk management system (Table), we are able to assess clinician performance by applying risk scores to confusion matrices for both management and diagnostic accuracy. Thirty nonconsecutive cases were reviewed by 10 experienced teledermatologists (all NHS consultants working in different units across the UK). Reflecting the typical case mix, lesions with both histology and clinical ground truths were included; 27 cases were selected at random and three known cancers were included. Overall, 90% of dermatologists missed between two and four high-risk malignancies each; only one dermatologist correctly identified all high-risk cancers. Management accuracy stratified by lesion risk demonstrated clinician sensitivity of 78.3% (95% confidence interval 70.2–84.8) and specificity of 47.5% (95% confidence interval 39.3–55.8). There were no significant differences in management accuracy between dermatologists (one-way ANOVA, P = 0.71), although there was only ‘moderate’ interrater agreement (Fleiss’ kappa 0.43). Similar performance was seen in diagnostic accuracy.
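The accuracy and agreement statistics reported above can be illustrated with a short Python sketch. It is not the authors' analysis code: the abstract does not state how the confidence intervals were computed, so the Wilson score interval used here is an assumption, and the function names and example inputs are hypothetical rather than study data. Fleiss' kappa follows its standard definition.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed method, not confirmed by the abstract)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned case i to category j. Every case must be rated
    by the same number of raters (at least two)."""
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each case needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Overall proportion of all ratings that fall in each category.
    p_j = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement among rater pairs, per case.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_cases          # mean observed agreement
    p_e = sum(p * p for p in p_j)       # agreement expected by chance
    if p_e == 1:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical example: 3 raters, 2 categories (refer / do not refer), 4 cases.
example_counts = [[3, 0], [2, 1], [0, 3], [1, 2]]
example_kappa = fleiss_kappa(example_counts)
example_ci = wilson_interval(successes=50, total=100)
```

On the commonly used Landis and Koch scale, kappa values from 0.41 to 0.60 are labelled 'moderate', which is consistent with the reported value of 0.43.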
Anonymized results were discussed in a dedicated forum and each dermatologist was provided with personalized feedback. This study describes an innovative teledermatology audit methodology that uses a quantitative risk-based approach to assess clinician performance, in terms of both management and diagnostic accuracy. We are now collaborating with a number of NHS partners to audit local teledermatology services using this methodology. Real-world cases were selected from multiple skin cancer pathway AI deployments, including some funded by an NHS AI in Health and Care Award. All authors are employed by the AI provider, which runs regular audits as part of its clinical governance framework.

Table
Risk score | Description
4 Major | Incorrectly diagnosed malignant lesion (false negative); diagnostic delay > 2 months
3 Significant | Incorrectly diagnosed premalignant lesion (false negative); diagnostic delay > 7 days
2 Moderate | Incorrectly diagnosed lesion (false positive) resulting in inappropriate referral for investigation; diagnostic delay < 7 days
1 Minor | Inconvenience or minor psychological upset
0 No risk | No risk
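One way to picture "applying risk scores to confusion matrices" is the minimal Python sketch below. It is an illustration under stated assumptions, not the published TARSS: the three lesion categories, the mapping of each confusion-matrix cell to a score from the table, the treatment of every over-call as score 2, and the summary by total and mean risk are all hypothetical choices. Score 1 (minor inconvenience) and the diagnostic-delay criteria are not modelled.

```python
from collections import Counter

# Hypothetical lesion categories; the abstract does not list the ones used.
CATEGORIES = ("malignant", "premalignant", "benign")


def risk_score(truth, assessed):
    """Map one (ground truth, clinician assessment) cell to a table risk score.

    The cell-to-score mapping is an assumption made for illustration.
    """
    if truth not in CATEGORIES or assessed not in CATEGORIES:
        raise ValueError(f"unknown category: {truth!r}, {assessed!r}")
    if truth == assessed:
        return 0  # No risk: correct assessment
    if truth == "malignant":
        return 4  # Major: missed malignant lesion (false negative)
    if truth == "premalignant" and assessed == "benign":
        return 3  # Significant: missed premalignant lesion (false negative)
    return 2      # Moderate: over-call, assumed to cause an inappropriate referral


def confusion_matrix(cases):
    """Count how often each (truth, assessed) pair occurs."""
    return Counter(cases)


def audit_risk(cases):
    """Weight each confusion-matrix cell by its risk score and summarise."""
    matrix = confusion_matrix(cases)
    total = sum(risk_score(t, a) * n for (t, a), n in matrix.items())
    n_cases = sum(matrix.values())
    return {"total_risk": total, "mean_risk": total / n_cases if n_cases else 0.0}


# Hypothetical assessments by one clinician (not study data).
example_cases = [
    ("malignant", "benign"),     # missed cancer, score 4
    ("benign", "malignant"),     # inappropriate referral, score 2
    ("benign", "benign"),        # correct, score 0
]
example_summary = audit_risk(example_cases)
```

Weighting cells rather than counting errors means a single missed malignancy contributes more to a clinician's summary than several inappropriate referrals, which is the intent of a risk-based audit.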
dermatology