Predicting Pathogenic Single Nucleotide Variants Through a Comprehensive Analysis on Multiple Level Features

Yiming Wu, Qifan Kuang, Yongcheng Dong, Ziyan Huang, Yan Li, Yizhou Li, Menglong Li
DOI: https://doi.org/10.1016/j.chemolab.2016.05.012
IF: 4.175
2016-01-01
Chemometrics and Intelligent Laboratory Systems
Abstract: Thanks to high-throughput sequencing technologies, many single nucleotide variants (SNVs) among individuals have been detected. SNVs in gene coding regions are known to potentially disrupt protein function. Consequently, much effort has been devoted to distinguishing deleterious SNVs from benign ones. In general, the features used in past studies can be categorized into codon level, peptide level, and protein level. While peptide-level features are in widespread use, few studies have carried out a comprehensive analysis combining information from all three levels. In the present work, we incorporated both codon-level and protein-level information with peptide-level information to predict disease-related SNVs. By taking advantage of the combined multi-level features, our method exhibited competitive performance against seven well-known classifiers. Additionally, by incorporating selective pressure scores and protein–protein interaction (PPI) information, we found that functionally important proteins were protected through a pressure-resistant mechanism during evolution. Although critical proteins were clearly associated with more deleterious SNVs, these pathogenic SNVs tended to be under higher selective pressure than the benign variants. These results support ongoing research on the relationship between genotype and phenotype.