Biomarker Discovery and Metabolic Profiling in Serum of Cardiovascular Disease Patients with Untargeted Metabolomics and Machine Learning.
Xia Shen, Shuyuan Guo, Ningning Liang, Mingming Zhao, Chun Wang, Zi Li, Dewen Yan, Lemin Zheng, Huiyong Yin
DOI: https://doi.org/10.1002/ctm2.1722
IF: 8.554
2024-01-01
Clinical and Translational Medicine
Abstract: Dear Editor, Here, we employed nontargeted metabolomics to examine serum metabolic profiles in a Chinese cohort of 243 participants comprising patients with coronary heart disease (CHD), patients with myocardial infarction (MI) and controls, and identified 48 and 46 differential metabolites to distinguish CHD and MI from control, respectively. Using statistical analysis and the Least Absolute Shrinkage and Selection Operator (LASSO), we built a model based on three polar metabolites, arginine, hypoxanthine and acetylcarnitine, that discriminated CHD from MI with an area under the curve (AUC) of .92 and .88 in the training and test sets, respectively. Metabolomics has emerged as an enabling technique for identifying circulating metabolites as potential biomarkers of disease, including cardiovascular disease (CVD).[1] Although dysregulation of lipid metabolism has been implicated in CVD, a comprehensive analysis of the underlying metabolomic profiles and disease-specific metabolic biomarkers is lacking, especially for polar metabolites.[2] In a large cohort of 10,741 samples, Zeller et al. found that five phosphatidylcholines (PCs) were negatively correlated with CHD,[3] while Wittenbecher et al. found significant correlations for ceramide 16:0 and PC 32:0 in heart failure.[4] Furthermore, combining metabolomics with machine learning algorithms holds enormous promise for building better diagnostic models for various human diseases.[5, 6] In this study, we collected serum samples from 73 MI patients, 83 CHD patients and 87 controls and identified a total of 702 metabolites using nontargeted metabolomics (Table 1, Figure 1). Principal component analysis (PCA) was conducted to evaluate the intrinsic metabolic variation and data quality. As shown in Figure 2A, the tight clustering of quality control (QC) samples indicated good reproducibility of our metabolomic data. The MI patients separated more clearly from the controls than the CHD patients did.
This observation suggests that metabolic alterations relative to controls are smaller in patients with CHD than in those with MI. To gain further insight into the metabolomic profiles of the three groups, we performed a supervised partial least-squares discriminant analysis (PLS-DA)[7] (Figure 2B) and found that the MI group was well separated from the CHD and control groups. The CHD group also showed improved separation from the control group, and the Q2 and R2 of the PLS-DA model were .671 and .857, respectively, indicating robust performance. We next performed univariate analysis to identify metabolites that differed significantly among the three groups (FDR < .05). In total, 80 metabolites were significantly dysregulated among the three groups (Figure 2C). We subjected these 80 metabolites to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis to explore the metabolic pathways dysregulated during CHD and MI progression, and observed five significantly dysregulated metabolic pathways (Figure 2D). Among the 80 differential metabolites, 15 showed a unidirectional trend along the disease progression trajectory from healthy control to CHD to MI (Figure 2E). After this systematic analysis of the metabolic profiles of the three groups, we selected the metabolites that were specifically altered in patients with CHD compared with the control group. This yielded 48 significantly dysregulated metabolites (Figure 2C); details of the upregulated and downregulated metabolites are shown in Figure 3A. Overall, 37 upregulated and 11 downregulated metabolites were subjected to KEGG pathway enrichment analysis, and alpha-linolenic acid metabolism was identified as the significantly dysregulated pathway in CHD patients (Figure 3B). Next, we identified 57 metabolites specifically dysregulated in MI patients (Figure 2C), of which 30 were upregulated and 27 downregulated (Figure 3C).
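The FDR threshold used in the univariate analysis above can be illustrated with a short sketch. The letter does not state which FDR procedure was applied; the example assumes the Benjamini-Hochberg adjustment, and the p-values, the function name `benjamini_hochberg` and the variable names are illustrative placeholders, not study data or the authors' code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Illustrative p-values only (assumption): one per hypothetical metabolite.
p_values = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
q_values = benjamini_hochberg(p_values)

# Keep the features that pass the FDR < .05 cut-off quoted in the text.
significant = [i for i, q in enumerate(q_values) if q < 0.05]
print("adjusted p-values:", [round(q, 4) for q in q_values])
print("indices passing FDR < .05:", significant)
```

In this toy input, two raw p-values below .05 (0.039 and 0.041) no longer pass after adjustment, which is the behaviour that motivates filtering on FDR rather than on raw p-values when hundreds of metabolites are tested at once.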
KEGG pathway analysis of the MI-specific metabolites found three significantly dysregulated pathways: fructose and mannose metabolism, glycolysis/gluconeogenesis, and alanine, aspartate and glutamate metabolism (Figure 3D). It is a clinical challenge to predict whether or when patients with CHD will develop MI. Here, we found significant metabolic differences between CHD and MI in the PCA and PLS-DA analyses; 46 metabolites were specifically dysregulated (Figure 2C), of which 39 were downregulated and 7 upregulated (Figure 3E). KEGG pathway analysis found two dysregulated pathways: galactose metabolism, and neomycin, kanamycin and gentamicin biosynthesis (Figure 3F). Following comprehensive metabolic profiling, we performed biomarker discovery analysis to predict CHD and MI and to differentiate individual patients based on their serum metabolomics data. To this end, we randomly assigned two-thirds of the samples to a training set and the remainder to a test set, and established a LASSO regression model to select a panel of metabolites as potential biomarkers.[8] To increase the confidence and reproducibility of the analysis, we considered only metabolites detected as [M+H]+ or [M-H]− molecular ions in the MS1 scan and classified as Grade 1 under the Metabolomics Standards Initiative (MSI) criteria.[9] We subsequently performed receiver operating characteristic (ROC) curve analysis on the training and test datasets.[10] The LASSO model identified three metabolites, acetylcarnitine, arginine and hypoxanthine, as potential biomarkers to differentiate MI from CHD (Figure 4A). The importance of these putative biomarkers is shown in Figure 4B. The AUC was .92 and .88 in the training and test sets, respectively (Figure 4C), and the relative abundance of the putative biomarkers is illustrated in a boxplot (Figure 4D).
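The workflow described above (a two-thirds/one-third split, LASSO-based feature selection and ROC evaluation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the L1-penalised logistic regression is fitted with a simple proximal gradient loop, and the penalty, step size, iteration count and all function names are hypothetical choices made for the example.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink a value towards zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_l1_logistic(X, y, penalty=0.05, step=0.2, n_iter=500):
    """L1-penalised logistic regression via proximal gradient descent."""
    n, n_feat = len(X), len(X[0])
    weights, intercept = [0.0] * n_feat, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * n_feat, 0.0
        for row, label in zip(X, y):
            linear = intercept + sum(w * x for w, x in zip(weights, row))
            err = sigmoid(linear) - label
            grad_b += err / n
            for j in range(n_feat):
                grad_w[j] += err * row[j] / n
        intercept -= step * grad_b  # the intercept is not penalised
        weights = [soft_threshold(w - step * g, step * penalty)
                   for w, g in zip(weights, grad_w)]
    return weights, intercept


def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Synthetic data (assumption): 8 features on a comparable scale, of which the
# first 3 are shifted in class 1. Real intensities would need scaling first.
rng = random.Random(0)
n_samples, n_features, n_informative = 240, 8, 3
X, y = [], []
for i in range(n_samples):
    label = i % 2
    row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
    for j in range(n_informative):
        row[j] += 1.5 * label
    X.append(row)
    y.append(label)

# Random split: two-thirds for training, the remainder for testing.
order = list(range(n_samples))
rng.shuffle(order)
cut = (2 * n_samples) // 3
train, test = order[:cut], order[cut:]

weights, intercept = fit_l1_logistic([X[i] for i in train],
                                     [y[i] for i in train])


def score(row):
    """Linear predictor of the fitted model for one sample."""
    return intercept + sum(w * x for w, x in zip(weights, row))


train_auc = roc_auc([y[i] for i in train], [score(X[i]) for i in train])
test_auc = roc_auc([y[i] for i in test], [score(X[i]) for i in test])
selected = [j for j, w in enumerate(weights) if w != 0.0]
print("selected feature indices:", selected)
print("train AUC: %.2f, test AUC: %.2f" % (train_auc, test_auc))
```

The L1 penalty sets the coefficients of weakly informative features to exactly zero, which is how a LASSO-type model reduces hundreds of candidate metabolites to a small panel. Reporting the AUC on the held-out third guards against an optimistic estimate from the training samples alone.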
Four metabolites, sphingomyelin, citrulline, glutamate and hexadecenoic acid, were selected to differentiate CHD from control groups with an AUC of .82 and .79 in the training and test sets, respectively (Figure S1A). Glutamate was the most important metabolite in our statistical model (Figure S1B and C). Interestingly, these four metabolites were all upregulated in the CHD group (Figure S1D). Lastly, four metabolites, acetylcarnitine, 5-methylthioadenosine, salicyluric acid and hypoxanthine, were used to distinguish MI patients from the control (Figure S2A). The AUC was .95 and .94 in the training and test sets, respectively (Figure S2B and C). 5-Methylthioadenosine was the most important metabolite in this prediction model (Figure S2B) and was the only downregulated metabolite among these putative biomarkers (Figure S2D). Finally, we structurally validated all of these putative biomarkers using commercial standards (Figure S3). Notably, we attempted to incorporate conventional clinical indices, such as LDL-C, ApoB and ApoA1, into our predictive models but did not observe significant improvements (data not shown). In conclusion, prediction models that use serum metabolites selected by metabolomics and machine learning offer great potential for improving the diagnosis of CHD and MI, pending validation in multicentre cohort studies.
Xia Shen, Dewen Yan, Lemin Zheng and Huiyong Yin conceived the project and wrote the manuscript. Xia Shen, Huiyong Yin and Ningning Liang designed and performed the experiments and analysed the data. Shuyuan Guo, Zi Li and Ningning Liang helped with LC-MS experiments. Mingming Zhao and Chun Wang helped with sample collection and provided patient information. Xia Shen and Huiyong Yin edited the manuscript. Huiyong Yin and Lemin Zheng supervised the work.
We thank the mass spectrometry platform, molecular biology/biochemistry/cell technology platform, experimental animal platform and biological sample pathology analysis platform at SINH, CAS.
This research was funded by grants from the National Key Research and Development Program of China (2022YFC2503300), the National Natural Science Foundation of China (32241017 and 32030053), the Shenzhen Medical Research Fund (SMRF B2302042) and the Science and Technology Commission of Shanghai Municipality (22140903300), and by startup funds from the City University of Hong Kong (9380154 and 7006046), the RGC Theme-based Research Scheme (8770011) and the TBSC Project fund. DWY was supported by the Shenzhen Clinical Research Center for Metabolic Diseases (Shenzhen Science, Technology and Innovation [2021]287) and the Shenzhen Center for Diabetes Control and Prevention (No. SZMHC [2020]46).
Conflict of interest: None declared.
This study was approved by the Ethics Committee of the Fujian Provincial Hospital and was conducted in accordance with the Declaration of Helsinki. Informed consent was obtained from all participants.