Extraction of Sequence Conservation Features for the Prioritization of Candidate Single Amino Acid Polymorphisms

Jiaxin Wu, Mingxin Gan, Wangshu Zhang, Rui Jiang
DOI: https://doi.org/10.5815/ijieeb.2011.02.01
2011-01-01
International Journal of Information Engineering and Electronic Business
Abstract: Although remarkable success has been achieved by genome-wide association (GWA) studies over the past few years, genetic variants discovered in GWA studies can typically account for only a small fraction of the heritability of most common diseases. As such, the identification of multiple rare variants that are associated with complex diseases has been receiving more and more attention. However, most of the recently developed statistical approaches for detecting association of rare variants with diseases require the selection of functional variants before the subsequent analysis, making an effective bioinformatics method for filtering out non-relevant rare variants indispensable. In this paper, we focus on a specific type of genetic variant called single amino acid polymorphisms (SAAPs). We propose to prioritize candidate SAAPs for a specific disease according to their association scores, which are calculated using a guilt-by-association model with a set of features derived from protein sequences. We validate the proposed approach in a systematic way and demonstrate that the proposed model is powerful in distinguishing disease-associated SAAPs for the specific disease of interest.