Data-driven clustering approach to identify novel clusters of high cognitive impairment risk among Chinese community-dwelling elderly people with normal cognition: A national cohort study

Wang Ran, Qiutong Yu
DOI: https://doi.org/10.7189/jogh.14.04088
2024-04-19
Abstract:
Background: Cognitive impairment is a highly heterogeneous disorder, and the characteristics that distinguish populations at different levels of risk need further investigation. Using a large-scale registry cohort of elderly individuals, we applied a data-driven approach to identify novel clusters based on diverse sociodemographic features.
Methods: A prospective cohort of 6398 elderly people from the Chinese Longitudinal Healthy Longevity Survey, followed between 2008 and 2014, was used to develop and validate the model. Participants were included if they were aged ≥60 years, were community-dwelling, and scored ≥18 on the Chinese version of the Mini-Mental State Examination (MMSE). Sixty-nine sociodemographic features were included in the analysis. The total population was divided into a derivation cohort (two-thirds, n = 4265) and a validation cohort (one-third, n = 2133). In the derivation cohort, an unsupervised Gaussian mixture model was applied to categorise participants into distinct clusters. A classifier was then developed from the 10 most important factors and used to assign participants in the validation cohort to their corresponding clusters. The three-year risk of cognitive impairment was compared across the clusters.
Results: We identified four clusters with distinct features in the derivation cohort. Cluster 1 was associated with the worst life independence, the longest sleep duration, and the oldest age. Cluster 2 showed the highest loneliness and was characterised by being unmarried and living alone. Cluster 3 was characterised by the lowest sense of loneliness and the highest proportions of married participants and family co-residence. Cluster 4 showed heightened engagement in exercise and leisure activity, along with independent decision-making, hygiene, and a diverse diet. Compared with Cluster 4, Cluster 1 had the highest three-year risk of cognitive impairment (adjusted odds ratio (aOR) = 3.31; 95% confidence interval (CI) = 1.81-6.05), followed by Cluster 2 and Cluster 3, after adjustment for baseline MMSE, residence, sex, age, years of education, drinking, smoking, hypertension, diabetes, heart disease and stroke or cardiovascular diseases.
Conclusions: A data-driven approach can be instrumental in identifying individuals at high risk of cognitive impairment among cognitively normal elderly populations. Because they are based on various sociodemographic features, these clusters can suggest individualised intervention plans.
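The reported estimate for Cluster 1 versus Cluster 4 (aOR = 3.31; 95% CI = 1.81-6.05) can be read on the log-odds scale. The sketch below is a minimal consistency check, not the authors' analysis: it assumes the interval is a Wald interval, symmetric around the log odds ratio, which the abstract does not state. The function name `wald_summary` and its return keys are illustrative.

```python
import math


def wald_summary(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover log-scale quantities from a reported odds ratio and 95% CI.

    Assumes a Wald interval: log(OR) +/- z_crit * SE (an assumption,
    not something stated in the abstract).
    """
    log_or = math.log(odds_ratio)
    log_low, log_high = math.log(ci_low), math.log(ci_high)

    # Under the Wald assumption the CI spans 2 * z_crit standard errors.
    se = (log_high - log_low) / (2 * z_crit)

    # If the interval is symmetric on the log scale, the geometric
    # midpoint of the bounds should reproduce the point estimate.
    midpoint_or = math.exp((log_low + log_high) / 2)

    # Wald z statistic for the null hypothesis OR = 1.
    z = log_or / se

    return {"log_or": log_or, "se": se, "midpoint_or": midpoint_or, "z": z}


# Values taken from the abstract: Cluster 1 vs Cluster 4.
summary = wald_summary(3.31, 1.81, 6.05)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Under this assumption the geometric midpoint of the bounds rounds to the reported 3.31, so the interval is consistent with a symmetric log-scale interval, and the implied z statistic lies well above the 1.96 threshold for a two-sided 5% test.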