Predicting Risk of Dementia with Survival Machine Learning and Statistical Methods: Results on the English Longitudinal Study of Ageing Cohort

Daniel Stamate, Henry Musto, Olesya Ajnakina, Daniel Stahl
2023-06-17
Abstract: Machine learning models that aim to predict dementia onset usually follow the classification methodology, ignoring the time until an event happens. This study presents an alternative, using survival analysis within the context of machine learning techniques. Two survival method extensions based on machine learning algorithms of Random Forest and Elastic Net are applied to train, optimise, and validate predictive models based on the English Longitudinal Study of Ageing (ELSA) cohort. The two survival machine learning models are compared with the conventional statistical Cox proportional hazard model, proving their superior predictive capability and stability on the ELSA data, as demonstrated by computationally intensive procedures such as nested cross-validation and Monte Carlo validation. This study is the first to apply survival machine learning to the ELSA data, and demonstrates in this case the superiority of AI-based predictive modelling approaches over the widely employed Cox statistical approach in survival analysis. Implications, methodological considerations, and future research directions are discussed.
Machine Learning
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper aims to address the following issues:

1. **Predicting the Onset Time of Dementia**: The paper proposes a method to predict the onset time of dementia using survival machine learning techniques (such as survival random forests and survival elastic nets) and traditional statistical methods (such as the Cox proportional hazards model). The goal is to improve the accuracy and stability of predictions through these methods.
2. **Surpassing Traditional Classification Methods**: Traditional classification methods often ignore the timing of events. This paper adopts survival analysis methods, utilizing machine learning techniques to predict the specific onset time of dementia, thereby obtaining more clinically relevant information.
3. **Comparing the Effectiveness of Different Models**: The paper compares two survival machine learning models (survival random forests and survival elastic nets) with the traditional Cox proportional hazards model, demonstrating the superiority of machine learning models in terms of predictive ability and stability.
4. **Feature Importance Analysis**: Through the survival random forest model, the paper also explores important variables affecting the onset time of dementia, including age, cognitive function, sleep quality, etc., and provides actionable prevention recommendations.
5. **Dataset Validation**: The paper uses data from the English Longitudinal Study of Ageing (ELSA) and validates the performance and stability of the models through methods such as nested cross-validation and Monte Carlo validation.

Overall, this paper aims to demonstrate the advantages of survival machine learning methods in predicting the onset time of dementia and to provide new ideas for future clinical predictive modeling.
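
The two validation schemes named above, nested cross-validation and Monte Carlo validation, can be sketched as index-splitting routines. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function names (`k_fold_indices`, `nested_cv_splits`, `monte_carlo_splits`), the fold counts, the number of repeats, and the test fraction are all assumptions chosen for the example, and the model fitting and scoring steps are left out.

```python
import random


def k_fold_indices(indices, k, rng):
    """Shuffle `indices` and split them into k near-equal folds."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv_splits(n_samples, outer_k=5, inner_k=3, seed=0):
    """Yield (outer_train, outer_test, inner_splits) for nested cross-validation.

    inner_splits is a list of (inner_train, inner_val) pairs drawn only from
    outer_train, so hyperparameter tuning never sees the outer test fold.
    Fold counts here are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    outer_folds = k_fold_indices(range(n_samples), outer_k, rng)
    for i, outer_test in enumerate(outer_folds):
        # Everything outside the current outer test fold is available for training.
        outer_train = [idx for j, fold in enumerate(outer_folds) if j != i for idx in fold]
        inner_folds = k_fold_indices(outer_train, inner_k, rng)
        inner_splits = []
        for m, inner_val in enumerate(inner_folds):
            inner_train = [idx for n, fold in enumerate(inner_folds) if n != m for idx in fold]
            inner_splits.append((inner_train, inner_val))
        yield outer_train, outer_test, inner_splits


def monte_carlo_splits(n_samples, n_repeats=100, test_fraction=0.2, seed=0):
    """Yield (train, test) index lists from repeated random hold-out splits."""
    rng = random.Random(seed)
    n_test = max(1, int(round(n_samples * test_fraction)))
    for _ in range(n_repeats):
        shuffled = list(range(n_samples))
        rng.shuffle(shuffled)
        yield shuffled[n_test:], shuffled[:n_test]
```

In a full pipeline, each survival model would be tuned on the inner splits, refitted on `outer_train`, and scored on `outer_test`. The spread of scores across the Monte Carlo repeats then gives a rough picture of each model's stability.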