Abstract:

Background: The key to effective stroke management is timely diagnosis and triage. Machine learning (ML) methods developed to assist in detecting stroke have focused on interpreting detailed clinical data such as clinical notes and diagnostic imaging results. However, such information may not be readily available when patients are initially triaged, particularly in rural and underserved communities.

Objective: This study aimed to develop an ML stroke prediction algorithm based on data widely available at the time of patients' hospital presentations and to assess the added value of social determinants of health (SDoH) in stroke prediction.

Methods: We conducted a retrospective study of emergency department and hospitalization records from 2012 to 2014 from all acute care hospitals in the state of Florida, merged with SDoH data from the American Community Survey. A case-control design was adopted to construct stroke and stroke mimic cohorts. We compared the algorithm performance and feature importance measures of the ML models (ie, gradient boosting machine and random forest) with those of a logistic regression model based on 3 sets of predictors. To provide insights into the prediction and ultimately assist care providers in decision-making, we used TreeSHAP to explain the stroke predictions of the tree-based ML models.

Results: Our analysis included 143,203 hospital visits of unique patients; the principal diagnosis at discharge confirmed a stroke in 73% (n=104,662) of these patients. The approach proposed in this study has high sensitivity and is particularly effective at reducing the misdiagnosis of dangerous stroke chameleons (false-negative rate <4%). ML classifiers consistently outperformed the benchmark logistic regression in all 3 input combinations. We found significant consistency across the models in the features that explain their performance. The most important features were age, the number of chronic conditions on admission, and primary payer (eg, Medicare or private insurance). Although both the individual- and community-level SDoH features helped improve the predictive performance of the models, the inclusion of the individual-level SDoH features led to a much larger improvement (area under the receiver operating characteristic curve [AUC] increased from 0.694 to 0.823) than the inclusion of the community-level SDoH features (AUC increased from 0.823 to 0.829).

Conclusions: Using data widely available at the time of patients' hospital presentations, we developed a stroke prediction model with high sensitivity and reasonable specificity. The prediction algorithm uses variables that are routinely collected by providers and payers and might be useful in underresourced hospitals with limited availability of sensitive diagnostic tools or incomplete data-gathering capabilities.
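The evaluation metrics named in the abstract (sensitivity, specificity, false-negative rate, and area under the receiver operating characteristic curve) can be illustrated with a minimal sketch. This is not the study's code: the function names, the 0.5 decision threshold, and the eight-record toy data set are all hypothetical and chosen only to make the arithmetic easy to follow. Labels use 1 for stroke and 0 for stroke mimic.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count true/false positives and negatives for binary labels (1 = stroke, 0 = mimic)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:  # truth == 1 and pred == 0: a missed stroke
            counts["fn"] += 1
    return counts


def sensitivity(c: Dict[str, int]) -> float:
    """Share of true strokes that the classifier flags."""
    return c["tp"] / (c["tp"] + c["fn"])


def specificity(c: Dict[str, int]) -> float:
    """Share of stroke mimics that the classifier correctly clears."""
    return c["tn"] / (c["tn"] + c["fp"])


def false_negative_rate(c: Dict[str, int]) -> float:
    """Share of true strokes that are missed; equals 1 - sensitivity."""
    return c["fn"] / (c["tp"] + c["fn"])


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random stroke case outscores a random mimic.

    Ties count as half a win. This pairwise form is quadratic in the sample
    size, so it suits small illustrations rather than large cohorts.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 strokes and 4 mimics with made-up model scores.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold

toy_counts = confusion_counts(toy_labels, toy_preds)
print(toy_counts)                       # {'tp': 3, 'fp': 1, 'tn': 3, 'fn': 1}
print(sensitivity(toy_counts))          # 0.75
print(false_negative_rate(toy_counts))  # 0.25
print(auc(toy_labels, toy_scores))      # 0.875
```

In this framing, a false-negative rate below 4% corresponds to a sensitivity above 96%, and lowering the decision threshold trades specificity for sensitivity while leaving the AUC unchanged.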

Enriching the Study Population for Ischemic Stroke Therapeutic Trials Using a Machine Learning Algorithm

Prediction-Driven Decision Support for Patients With Mild Stroke: A Model Based on Machine Learning Algorithms

A PREDICTION MODEL FOR STROKE BASED ON MACHINE LEARNING ALGORITHMS

Causative Classification of Ischemic Stroke by the Machine Learning Algorithm Random Forests

Enhancing Ischemic Stroke Management: Leveraging Machine Learning Models for Predicting Patient Recovery After Alteplase Treatment

A Machine Learning Approach to Support Urgent Stroke Triage Using Administrative Data and Social Determinants of Health at Hospital Presentation: Retrospective Study

Prediction of large vessel occlusion for ischaemic stroke by using the machine learning model random forests

Prediction of Long-Term Stroke Recurrence Using Machine Learning Models

Machine learning-based prognostication of mortality in stroke patients

Predictive etiological classification of acute ischemic stroke through interpretable machine learning algorithms: a multicenter, prospective cohort study

Exploring Machine Learning for Predicting Cerebral Stroke: A Study in Discovery

Machine Learning for Outcome Prediction of Acute Ischemic Stroke Post Intra-Arterial Therapy

Interpretable Machine Learning Modeling for Ischemic Stroke Outcome Prediction

Development and Validation of Machine Learning Algorithms to Predict 1-Year Ischemic Stroke and Bleeding Events in Patients with Atrial Fibrillation and Cancer

Machine Learning-Based Prediction of Stroke in Emergency Departments

Abstract WP203: Prediction of Outcome After Endovascular Therapy for Acute Ischemic Stroke: A Machine Learning Approach From Japan Stroke Data Bank

Predicting 90-Day Prognosis in Ischemic Stroke Patients Post Thrombolysis Using Machine Learning

Machine learning-based prediction of early neurological deterioration after intravenous thrombolysis for stroke: insights from a large multicenter study

Machine learning prediction of hospital discharge disposition for inpatients with acute ischemic stroke following mechanical thrombectomy in the United States

Predicting Clinical Outcomes in Acute Ischemic Stroke Patients Undergoing Endovascular Thrombectomy with Machine Learning

Prediction of Clinical Outcome in Patients with Large-Vessel Acute Ischemic Stroke: Performance of Machine Learning versus SPAN-100