Identification of Disease-Related nsSNPs via the Integration of Protein Sequence Features and Domain-Domain Interaction Data

Rui Jiang, Mingxin Gan, Jiaxin Wu
DOI: https://doi.org/10.1504/ijcbdd.2012.049204
2012-01-01
International Journal of Computational Biology and Drug Design
Abstract: Recent studies have suggested the common disease-rare variant (CD-RV) hypothesis in the mapping of disease-related genetic variants and have proposed a number of statistical methods to detect associations between rare variants and human inherited diseases. However, most of these methods rely on the selection of functional variants as a preliminary step in order to maximise the power of statistical tests. To this end, we put forward a filtration approach, framed as one-class novelty learning, that identifies genetic variants potentially associated with a query disease of interest. We propose to prioritise candidate non-synonymous single nucleotide polymorphisms (nsSNPs) by integrating two sequence conservation properties of amino acids, calculated from multiple sequence alignments of protein sequences, with one functional similarity measure derived from domain-domain interaction data. We demonstrate the power of this approach in detecting disease-related nsSNPs via large-scale leave-one-out cross-validation experiments.
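The prioritisation idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the one-class model, so distance to the centroid of known disease-related variants is used here as a stand-in novelty score. The function names (`centroid`, `novelty_score`, `loo_ranks`) and all feature values are hypothetical; each variant is assumed to be described by three numbers (two conservation scores and one functional similarity score).

```python
import math

def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def novelty_score(x, center):
    """Stand-in one-class score: Euclidean distance to the seed centroid.
    Smaller means more similar to known disease-related variants."""
    return math.dist(x, center)

def loo_ranks(seeds, candidates):
    """Leave-one-out check: hold out each known disease variant in turn,
    fit the centroid on the remaining seeds, and record the rank of the
    held-out variant among the unlabeled candidates (1 = best)."""
    ranks = []
    for i, held_out in enumerate(seeds):
        training = seeds[:i] + seeds[i + 1:]
        center = centroid(training)
        target = novelty_score(held_out, center)
        better = sum(1 for c in candidates if novelty_score(c, center) < target)
        ranks.append(better + 1)
    return ranks

# Made-up feature vectors: [conservation_1, conservation_2, functional_similarity]
seeds = [[0.9, 0.8, 0.7], [0.8, 0.9, 0.6], [0.85, 0.75, 0.8]]
candidates = [[0.1, 0.2, 0.1], [0.3, 0.1, 0.2], [0.2, 0.3, 0.0]]

ranks = loo_ranks(seeds, candidates)
print(ranks)
```

A held-out variant that ranks near the top across many such runs suggests the three features carry signal; the paper's large-scale experiments would use real variants and the authors' own scoring model rather than this toy centroid.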