A clustering approach to improve our understanding of the genetic and phenotypic complexity of chronic kidney disease

A. Eoli, S. Ibing, C. Schurmann, G. N. Nadkarni, H. O. Heyne, E. Böttinger
DOI: https://doi.org/10.1038/s41598-024-59747-4
IF: 4.6
2024-04-27
Scientific Reports
Abstract: Chronic kidney disease (CKD) is a complex disorder that causes a gradual loss of kidney function, affecting approximately 9.1% of the world's population. Here, we use a soft-clustering algorithm to deconstruct its genetic heterogeneity. First, we selected 322 CKD-associated independent genetic variants from published genome-wide association studies (GWAS) and added association results for 229 traits from the GWAS catalog. We then applied nonnegative matrix factorization (NMF) to discover overlapping clusters of related traits and variants. We computed cluster-specific polygenic scores and validated each cluster with a phenome-wide association study (PheWAS) on the BioMe biobank (n = 31,701). NMF identified nine clusters that reflect different aspects of CKD, with the top-weighted traits signifying areas such as kidney function, type 2 diabetes (T2D), and body weight. For most clusters, the top-weighted traits were confirmed in the PheWAS analysis. Results were found to be more significant in the cross-ancestry analysis, although significant ancestry-specific associations were also identified. While all alleles were associated with a decreased kidney function, associations with CKD-related diseases (e.g., T2D) were found only for a smaller subset of variants and differed across genetic ancestry groups. Our findings leverage genetics to gain insights into the underlying biology of CKD and investigate population-specific associations.
multidisciplinary sciences
What problem does this paper attempt to address?
This paper addresses the genetic heterogeneity and phenotypic complexity of chronic kidney disease (CKD). Specifically, the authors use a soft-clustering algorithm, nonnegative matrix factorization (NMF), to analyze CKD-associated genetic variants and so better understand the genetic subtypes of CKD and their biological mechanisms. With this approach, the authors aim to identify distinct genetic clusters and verify the associations between these clusters and specific phenotypes, providing new insights for the clinical management and personalized treatment of CKD.

### Main research steps:

1. **Data collection**: 322 independent CKD-associated genetic variants were selected from published genome-wide association studies (GWAS), and association results for 229 traits were obtained from the GWAS catalog.
2. **Application of NMF**: NMF was applied to these variants and traits to discover overlapping clusters of related traits and variants.
3. **Calculation of polygenic scores**: Cluster-specific polygenic scores (PGS) were calculated from the weights of each cluster.
4. **Verification and interpretation**: A phenome-wide association study (PheWAS) was carried out in the BioMe biobank (n = 31,701) to verify the phenotypic associations of each cluster.

### Research findings:

- **Nine genetic clusters**: NMF identified nine clusters reflecting different aspects of CKD, with top-weighted traits related to kidney function, type 2 diabetes (T2D), body weight, and other areas.
- **Phenotypic verification**: For most clusters, the top-weighted traits were confirmed in the PheWAS analysis. Results were more significant in the cross-ancestry analysis, but significant ancestry-specific associations were also found.
- **Influence of genetic variants**: All alleles were associated with decreased kidney function, but associations with CKD-related diseases (such as T2D) were found only for a smaller subset of variants and differed across genetic ancestry groups.

### Significance:

- **Insight into biological mechanisms**: The study uses genetics to provide new insights into the underlying biology of CKD.
- **Population-specific associations**: The study also explores differences in genetic associations across ancestry groups, emphasizing the importance of conducting research in diverse populations.
- **Potential for clinical applications**: In the future, patients could be stratified by genotype to support more personalized, genetically informed clinical management, with the aim of improving care and treatment outcomes for CKD patients.

In conclusion, this study uses cluster analysis to reveal the genetic complexity of CKD and provides a basis for future clinical research and personalized medicine.
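
The NMF and polygenic-score steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy matrix `V`, the `DOSAGES` table, the choice of two clusters, and all function names are hypothetical, and the factorization uses plain Lee-Seung multiplicative updates, which may differ from the NMF variant used in the paper. The sketch also assumes the variant-by-trait association matrix is already nonnegative; how signed association statistics were made nonnegative is not stated in this summary.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, k, n_iter=200, seed=0, eps=1e-9):
    """Factorize nonnegative V (variants x traits) into W (variants x k)
    and H (k x traits) with Lee-Seung multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h

def cluster_scores(dosages, w):
    """Cluster-specific score per person: sum over variants of
    allele dosage times the variant's weight in that cluster."""
    return matmul(dosages, w)

# Hypothetical toy data: 6 variants x 4 traits (nonnegative association strengths).
V = [
    [3.0, 2.5, 0.0, 0.1],
    [2.8, 3.1, 0.2, 0.0],
    [3.2, 2.9, 0.0, 0.0],
    [0.0, 0.1, 2.7, 3.0],
    [0.2, 0.0, 3.1, 2.6],
    [1.5, 1.4, 1.6, 1.5],  # a variant that loads on both groups of traits
]
# Hypothetical allele dosages: 3 people x 6 variants (0, 1, or 2 copies).
DOSAGES = [
    [2, 1, 2, 0, 0, 1],
    [0, 0, 1, 2, 2, 1],
    [1, 1, 1, 1, 1, 2],
]

W, H = nmf(V, k=2)
scores = cluster_scores(DOSAGES, W)
for person, row in enumerate(scores):
    print(person, [round(s, 2) for s in row])
```

Because every variant receives a weight in every cluster, a variant such as the last row of `V` can contribute to more than one cluster, which is what makes the clustering "soft". In practice a library implementation of NMF and a principled choice of the number of clusters would replace this hand-written loop.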