Protocol paper: A.I. Based STroke Risk fActor Classification and Treatment (ABSTRACT) study

William Heseltine-Carp, Aishwarya Kasabe, Megan Courtman, Michael Allen, Adam Streeter, Mark Thurston, Hongrui Wang, Lucy Mcgavin, Emmanuel Ifeachor, Stephen Mullin
DOI: https://doi.org/10.1101/2024.12.11.24318721
2024-12-11
Abstract:
Background Stroke is a leading cause of death and disability in the UK. Much of stroke management revolves around addressing risk factors with medication and lifestyle modification. However, 30% of those who suffer a stroke have no known risk factors. Hence, there is a need to better identify individuals who are at high risk of stroke, particularly those in whom the benefit of treatment outweighs the risk. ABSTRACT is a three-phase study that looks to address this issue by (1) using artificial intelligence (AI) to predict stroke risk from routine hospital data, (2) validating this model on external datasets, and (3) validating its ability to improve outcomes by guiding clinical decision making. In this paper we focus on phase I of the project.
Aims Phase I of this study has four main objectives:
(1) To create four separate machine learning (ML) models to predict stroke risk from routine hospital data: one for CT/MRI/PMH brain data, one for ECG/echo/PMH data and one for laboratory test/PMH data.
(2) To perform explainability analysis on these models to identify important and novel risk factors for stroke.
(3) To calibrate these models and align them with real-world probabilities.
(4) To combine these models into an ensemble stroke prediction model.
Methods In this retrospective observational cohort study we will analyze data from 9,155 stroke patients and 109,875 controls in southwest England. Stroke cases will be sourced from the SSNAP database, and historical brain imaging (CT/MRI), ECG, echocardiography, laboratory tests, ultrasound and medical history will be obtained from hospital and GP records. These data will then be linked to form a single de-identified dataset of cases and controls. ML models will then be trained on these data to predict stroke risk and identify novel risk factors for stroke.
Discussion This protocol paper outlines phase one of ABSTRACT's approach to creating a novel stroke risk prediction model by integrating multimodal data types such as routine brain imaging, ECG, echocardiography, and laboratory results. In particular, we outline a bespoke data handling protocol designed to comply with UK ethical governance when processing large volumes of confidential data. We also discuss our strategy for cleaning and preparing data before training ML algorithms to predict stroke.
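The aims of calibrating the models to real-world probabilities and combining them into an ensemble can be illustrated with a minimal sketch. It is not the method described in the protocol, and it rests on several assumptions: that the ensemble is a simple average on the log-odds scale, that calibration is a prior-shift (intercept) correction for case-control sampling, and that the per-modality scores and the target prevalence shown are hypothetical values chosen only for illustration. Only the case and control counts come from the abstract.

```python
import math


def logit(p: float) -> float:
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(z: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def ensemble(probabilities: list[float]) -> float:
    """Combine per-modality probabilities by averaging their log-odds.

    Assumption: equal weights. A real ensemble might learn the weights.
    """
    mean_logit = sum(logit(p) for p in probabilities) / len(probabilities)
    return sigmoid(mean_logit)


def adjust_for_prevalence(p: float, sample_prev: float, target_prev: float) -> float:
    """Prior-shift correction for a model trained on an enriched sample.

    Subtracts the log-odds of the sample prevalence and adds the log-odds
    of the target prevalence. Valid only if controls were sampled
    independently of the predictors.
    """
    return sigmoid(logit(p) - logit(sample_prev) + logit(target_prev))


# Cohort sizes stated in the abstract.
N_CASES, N_CONTROLS = 9155, 109_875
SAMPLE_PREV = N_CASES / (N_CASES + N_CONTROLS)

# Hypothetical real-world risk, for illustration only.
TARGET_PREV = 0.002

# Hypothetical outputs of three modality-specific models for one patient.
modality_scores = {"imaging": 0.30, "cardiac": 0.20, "laboratory": 0.25}

combined = ensemble(list(modality_scores.values()))
calibrated = adjust_for_prevalence(combined, SAMPLE_PREV, TARGET_PREV)

print(f"sample prevalence: {SAMPLE_PREV:.4f}")
print(f"ensemble score (sample scale): {combined:.4f}")
print(f"score rescaled to target prevalence: {calibrated:.4f}")
```

Because cases make up a far larger share of the study dataset than they would of the general population, a score read directly from such a model overstates absolute risk. The rescaling step shows why an explicit calibration objective is needed before predictions can inform treatment decisions.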