Abstract:Background The existing dementia risk models are limited to known risk factors and traditional statistical methods. We aimed to employ machine learning (ML) to develop a novel dementia prediction model by leveraging a rich-phenotypic variable space of 366 features covering multiple domains of health-related data. Methods In this longitudinal population-based cohort of the UK Biobank (UKB), 425,159 non-demented participants were enrolled from 22 recruitment centres across the UK between March 1, 2006 and October 31, 2010. We implemented a data-driven strategy to identify predictors from 366 candidate variables covering a comprehensive range of genetic and environmental factors and developed the ML model to predict incident dementia and Alzheimer's Disease (AD) within five, ten, and much longer years (median 11.9 [Interquartile range 11.2-12.5] years). Findings During a follow-up of 5,023,337 person-years, 5287 and 2416 participants developed dementia and AD, respectively. A novel UKB dementia risk prediction (UKB-DRP) model comprising ten predictors including age, ApoE e4, pairs matching time, leg fat percentage, number of medications taken, reaction time, peak expiratory flow, mother's age at death, long-standing illness, and mean corpuscular volume was established. Our prediction model was internally evaluated based on five-fold cross-validation on discrimination and calibration, and it was further compared with existing prediction scales. The UKB-DRP model can achieve high discriminative accuracy in dementia (AUC 0.848 +/- 0.007) and even better in AD (AUC 0.862 +/- 0.015). The model was well-calibrated (Hosmer-Lemeshow goodness-of-fit p-value = 0.92), and the predictive power was solid in different incidence time groups. More importantly, our model presented an apparent superiority over existing models like Cardiovascular Risk Factors, Aging, and Incidence of Dementia Risk Score (AUC 0.705 +/- 0.008), the Dementia Risk Score (AUC 0.752 +/- 0.007), and the Australian National University Alzheimer's Disease Risk Index (AUC 0.584 +/- 0.017). The model was internally validated in the general population of European ancestry and White ethnicity; thus, further validation with independent datasets is necessary to confirm these findings. Interpretation Our ML-based UKB-DRP model incorporated ten easily accessible predictors with solid predictive power for incident dementia and AD within five, ten, and much longer years, which can be used to identify individuals at high risk of dementia and AD in the general population. Copyright (C) 2022 The Author(s). Published by Elsevier Ltd.

Development of machine learning-based models to predict 10-year risk of cardiovascular disease: a prospective cohort study

Improving Cardiovascular Risk Prediction Through Machine Learning Modelling of Irregularly Repeated Electronic Health Records

A Cox-Based Risk Prediction Model for Early Detection of Cardiovascular Disease: Identification of Key Risk Factors for the Development of a 10-Year CVD Risk Prediction

Deep Phenotyping and Prediction of Long-term Cardiovascular Disease: Optimized by Machine Learning

Development of a Novel Dementia Risk Prediction Model in the General Population: A Large, Longitudinal, Population-Based Machine-Learning Study

A Cardiovascular Disease Prediction Model Based on Routine Physical Examination Indicators Using Machine Learning Methods: A Cohort Study

Machine Learning Outperforms Traditional Logistic Regression and Offers New Possibilities for Cardiovascular Risk Prediction: A Study Involving 143,043 Chinese Patients with Hypertension

A comparative study of model-centric and data-centric approaches in the development of cardiovascular disease risk prediction models in the UK Biobank

Development of a Model to Predict 10-Year Risk of Ischemic and Hemorrhagic Stroke and Ischemic Heart Disease Using the China Kadoorie Biobank

A Machine Learning Model Based on Genetic and Traditional Cardiovascular Risk Factors to Predict Premature Coronary Artery Disease

Improving Cardiovascular Disease Risk Prediction With Machine Learning Using Mental Health Data: A Prospective Uk Biobank Study

Improving Cardiovascular Disease Prediction With Machine Learning Using Mental Health Data: A Prospective UK Biobank Study

Machine Learning-Based Cardiovascular Disease Prediction Model: A Cohort Study on the Korean National Health Insurance Service Health Screening Database

Using machine learning-based algorithms to construct cardiovascular risk prediction models for Taiwanese adults based on traditional and novel risk factors

Machine learning for the prediction of atherosclerotic cardiovascular disease during 3-year follow up in Chinese type 2 diabetes mellitus patients

Stroke Risk Prediction Using Machine Learning: a Prospective Cohort Study of 0.5 Million Chinese Adults

Machine-learning versus traditional approaches for atherosclerotic cardiovascular risk prognostication in primary prevention cohorts: a systematic review and meta-analysis

Study on the risk of coronary heart disease in middle-aged and young people based on machine learning methods: a retrospective cohort study

Comparing the performance of machine learning and conventional models for predicting atherosclerotic cardiovascular disease in a general Chinese population

Development and validation of risk prediction model for recurrent cardiovascular events among Chinese: the Personalized CARdiovascular DIsease risk Assessment for Chinese model

Predicting cardiovascular risk from national administrative databases using a combined survival analysis and deep learning approach