A machine learning prediction model for Cardiac Amyloidosis using routine blood tests in patients with left ventricular hypertrophy

Yuling Pan, Qingkun Fan, Yu Liang, Yunfan Liu, Haihang You, Chunzi Liang
DOI: https://doi.org/10.1038/s41598-024-77466-8
2024-11-19
Abstract: Current approaches for cardiac amyloidosis (CA) identification are time-consuming, labor-intensive, and present challenges in sensitivity and accuracy, leading to limited treatment efficacy and poor prognosis for patients. In this retrospective study, we aimed to leverage machine learning (ML) to create a diagnostic model for CA using data from routine blood tests. Our dataset included 6,563 patients with left ventricular hypertrophy, 261 of whom had been diagnosed with CA. We divided the dataset into training and testing cohorts and applied ML algorithms, including logistic regression, random forest, and XGBoost, for automated learning and prediction. The model's diagnostic accuracy was then compared with that of established CA biomarkers, specifically serum free light chains (FLCs). The model's interpretability was examined by visualizing feature importance through the gain map. XGBoost outperformed both random forest and logistic regression in internal validation on the testing cohort, achieving an area under the curve (AUC) of 0.95 (95% CI: 0.92-0.97), sensitivity of 0.92 (95% CI: 0.86-0.98), specificity of 0.95 (95% CI: 0.94-0.97), and an F1 score of 0.89 (95% CI: 0.85-0.92). Its performance was also superior to the combination of serum FLC-kappa and FLC-lambda (AUC of 0.88). Furthermore, XGBoost identified unique biomarker signatures indicative of multisystem dysfunction in CA patients, with significant changes in eGFR, FT3, cTnI, ANC, and NT-proBNP. This study develops a highly sensitive and accurate ML model for CA detection using routine clinical laboratory data. The model streamlines diagnostic procedures, provides valuable clinical insights, and can guide future research into disease mechanisms.
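The evaluation metrics named in the abstract (AUC, sensitivity, specificity, F1 score) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the eight-patient toy data are all assumptions made for illustration, and the numbers bear no relation to the study's results. AUC is computed here through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as one half).

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels where 1 = CA and 0 = non-CA."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true CA cases that the model flags."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of non-CA cases that the model correctly clears."""
    return tn / (tn + fp)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, written in terms of counts."""
    return 2 * tp / (2 * tp + fp + fn)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 CA patients and 5 non-CA patients with model scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]

# Assumed decision threshold of 0.5; the paper's operating point is not stated here.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
f1 = f1_score(tp, fp, fn)
auc = roc_auc(y_true, scores)

print(f"sensitivity={sens:.3f} specificity={spec:.3f} F1={f1:.3f} AUC={auc:.3f}")
```

Sensitivity, specificity, and F1 depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why a model and a biomarker combination such as FLC-kappa plus FLC-lambda can be compared on AUC alone.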