Prediction and causal inference of cardiovascular and cerebrovascular diseases based on lifestyle questionnaires

Riku Nambo, Shigehiro Karashima, Ren Mizoguchi, Seigo Konishi, Atsushi Hashimoto, Daisuke Aono, Mitsuhiro Kometani, Kenji Furukawa, Takashi Yoneda, Kousuke Imamura, Hidetaka Nambo
DOI: https://doi.org/10.1038/s41598-024-61047-w
IF: 4.6
2024-05-09
Scientific Reports
Abstract: Cardiovascular and cerebrovascular diseases (CCVD) are leading causes of death in Japan, necessitating effective preventative measures, early diagnosis, and treatment to mitigate their impact. A diagnostic model was developed to identify patients with ischemic heart disease (IHD), stroke, or both, using specific health examination data. Lifestyle habits affecting CCVD development were analyzed using five causal inference methods. This study included 473,734 patients aged ≥ 40 years who underwent specific health examinations in Kanazawa, Japan between 2009 and 2018 to collect data on basic physical information, lifestyle habits, and laboratory parameters such as diabetes, lipid metabolism, renal function, and liver function. Four machine learning algorithms were used: Random Forest, Logistic Regression, Light Gradient Boosting Machine, and eXtreme Gradient Boosting (XGBoost). The XGBoost model exhibited the highest area under the curve (AUC), with mean values of 0.770 (± 0.003), 0.758 (± 0.003), and 0.845 (± 0.005) for stroke, IHD, and CCVD, respectively. The results of the five causal inference analyses were summarized, and lifestyle behavior changes were observed after the onset of CCVD. A causal relationship from 'reduced mastication' to 'weight gain' was found by all of the causal inference methods. This prediction algorithm can screen for asymptomatic myocardial ischemia and stroke. By selecting high-risk patients suspected of having CCVD, resources can be used more efficiently for secondary testing.
multidisciplinary sciences
What problem does this paper attempt to address?
The paper addresses the prediction and causal analysis of cardiovascular and cerebrovascular diseases (CCVD). The authors developed a diagnostic model that uses specific health examination data to identify patients with ischemic heart disease (IHD), stroke, or both. The data cover 473,734 participants aged 40 years or older who underwent specific health examinations in Kanazawa, Japan between 2009 and 2018, and include basic physical information, lifestyle habits, and laboratory parameters. Four machine learning algorithms were compared: Random Forest, Logistic Regression, Light Gradient Boosting Machine (LGBM), and eXtreme Gradient Boosting (XGBoost). XGBoost achieved the highest mean area under the curve (AUC): 0.770 (± 0.003) for stroke, 0.758 (± 0.003) for IHD, and 0.845 (± 0.005) for CCVD. The study also applied five causal inference methods to the lifestyle data. The summarized results indicate that lifestyle behaviors changed after the onset of CCVD, and every method found a causal relationship from "reduced chewing" to "weight gain." The authors propose the prediction algorithm as a screening tool for asymptomatic myocardial ischemia and stroke, so that secondary testing resources can be concentrated on patients at high risk of CCVD.
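The AUC figures are reported as a mean with a ± spread. The following is a minimal sketch of how such a summary can be computed, not the authors' pipeline. It assumes the spread is a standard deviation across validation folds, which this summary does not state. The function names (`roc_auc`, `summarize_auc`) and the toy labels and scores are hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize_auc(folds):
    """Return (mean, sample standard deviation) of per-fold AUC values.

    Assumption: each fold is a (labels, scores) pair from a held-out split.
    """
    aucs = [roc_auc(labels, scores) for labels, scores in folds]
    return mean(aucs), stdev(aucs)


# Toy data only: three hypothetical validation folds (1 = disease, 0 = no disease).
folds = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4]),
    ([1, 0, 1, 0], [0.8, 0.7, 0.3, 0.1]),
    ([0, 1, 0, 1], [0.5, 0.5, 0.1, 0.9]),
]

mean_auc, sd_auc = summarize_auc(folds)
print(f"AUC {mean_auc:.3f} (± {sd_auc:.3f})")  # prints "AUC 0.875 (± 0.125)"
```

The pairwise definition is quadratic in the number of cases and is used here only for readability. At the scale of this study, a rank-based implementation from an established library would be the practical choice.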