GSW-FI: a GLM model incorporating shrinkage and double-weighted strategies for identifying cancer driver genes with functional impact

Xiaolu Xu, Zitong Qi, Lei Wang, Meiwei Zhang, Zhaohong Geng, Xiumei Han
DOI: https://doi.org/10.1186/s12859-024-05707-8
IF: 3.307
2024-03-07
BMC Bioinformatics
Abstract: Cancer, a disease with high morbidity and mortality rates, poses a significant threat to human health. Driver genes, which harbor mutations accountable for the initiation and progression of tumors, play a crucial role in cancer development. Identifying driver genes stands as a paramount objective in cancer research and precision medicine.
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology
What problem does this paper attempt to address?
The problem this paper attempts to address is: how to efficiently identify cancer driver genes with functional impact. Specifically, the authors propose a new method, GSW-FI (Generalized Linear Regression Model with Shrinkage and Double-Weighted Strategies for Identifying Cancer Driver Genes with Functional Impact), aiming to improve the accuracy of identifying cancer driver genes by integrating background estimation of gene functional impact, shrinkage strategies, and double-weighted strategies.

### Background

Cancer is a disease with high incidence and high mortality, posing a serious threat to human health. Driver genes are those that carry mutations leading to tumor initiation and progression, playing a crucial role in cancer development. Therefore, identifying driver genes is an important goal in cancer research and precision medicine.

### Methods

1. **Background Functional Impact Assessment**:
   - Use a Generalized Linear Regression Model (GLM) to assess the Background Functional Impact Score (BFIS) of genes, using gene features as predictors.
2. **Shrinkage Strategy**:
   - Smooth the estimated values using information from neighboring genes to reduce bias and improve the stability of the estimates.
3. **Double-Weighted Strategy**:
   - Employ two independent weighting strategies: the proportion of deleterious mutations and the exponential proportion of deleterious mutations in samples, to reasonably assess the functional impact of genes.
4. **Hypothesis Testing**:
   - Design statistical methods to identify potential cancer driver genes by comparing the observed Functional Impact Scores (FIS) with the distribution of background functional impact scores.

### Results

- Experimental results show that GSW-FI outperforms ten other prediction methods across 31 TCGA datasets, particularly excelling in the overlap ratio with known databases and consensus predictions among different methods.
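
The four steps listed under Methods can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the summary does not give the GLM family, the neighbor definition, the weight formula, or the test statistic, so every choice here is an assumption. The sketch assumes a Gaussian GLM with identity link (ordinary least squares), neighbors defined as the `k` closest genes in feature space, weights applied multiplicatively, and a one-sided normal-tail test. All function and parameter names (`fit_background`, `shrink_with_neighbors`, `double_weight`, `score_genes`, `k`, `alpha`) are hypothetical.

```python
from math import erfc, sqrt

import numpy as np


def fit_background(features, scores):
    """Background score per gene from gene features.

    Assumption: a Gaussian GLM with identity link, i.e. ordinary least squares.
    """
    design = np.column_stack([np.ones(len(features)), features])
    beta, *_ = np.linalg.lstsq(design, scores, rcond=None)
    return design @ beta


def shrink_with_neighbors(features, estimates, k=3, alpha=0.5):
    """Blend each gene's estimate with the mean estimate of its k nearest genes.

    Assumption: "neighboring genes" means nearest in feature space.
    """
    dists = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a gene is not its own neighbor
    neighbors = np.argsort(dists, axis=1)[:, :k]
    return alpha * estimates + (1 - alpha) * estimates[neighbors].mean(axis=1)


def double_weight(raw_fis, frac_deleterious, frac_samples_deleterious):
    """Apply both weights to the raw functional impact score.

    Assumption: the two weights multiply, the second after exponentiation.
    """
    return raw_fis * frac_deleterious * np.exp(frac_samples_deleterious)


def score_genes(features, raw_fis, frac_deleterious, frac_samples_deleterious):
    """Return one-sided p-values; small values flag candidate driver genes."""
    observed = double_weight(raw_fis, frac_deleterious, frac_samples_deleterious)
    background = shrink_with_neighbors(features, fit_background(features, observed))
    residuals = observed - background
    sd = max(residuals.std(), 1e-12)  # guard against a degenerate perfect fit
    z = residuals / sd
    # Upper-tail probability of a standard normal (assumed null distribution).
    return np.array([0.5 * erfc(v / sqrt(2)) for v in z])
```

Calling `score_genes` with a gene-by-feature matrix and three per-gene vectors returns one p-value per gene. In practice these would still need multiple-testing correction before any gene is reported as a candidate driver.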
### Conclusion

GSW-FI provides a new computational method that can efficiently identify cancer driver genes with functional impact, thereby advancing the development of precision medicine in cancer.