Approximation Algorithms for the Selection of Robust Tag SNPs

Yao-Ting Huang, Kui Zhang, Ting Chen, Kun-Mao Chao
DOI: https://doi.org/10.1007/978-3-540-30219-3_24
2007-01-01
Abstract: Recent studies have shown that chromosomal recombination takes place only at narrow hotspots. Within the chromosomal region between these hotspots (called a haplotype block), little or no recombination occurs, and a small subset of SNPs (called tag SNPs) is sufficient to capture the haplotype pattern of the block. In practice, some tag SNPs may be genotyped as missing data, and the resulting ambiguity may make it impossible to distinguish two distinct haplotypes. In this paper, we formulate this problem as finding a set of SNPs (called robust tag SNPs) that can tolerate missing data. To find robust tag SNPs, we propose two greedy algorithms and one LP-relaxation algorithm, which give solutions with approximation ratios of \((m+1)\ln\frac{K(K-1)}{2}\), \(\ln((m+1)\frac{K(K-1)}{2})\), and \(O(m\ln K)\), respectively, where m is the number of SNPs allowed for missing data and K is the number of patterns in the block.
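As a rough illustration of the problem described in the abstract, the sketch below assumes that a tag set tolerates m missing SNPs when every pair of the K haplotype patterns differs at no fewer than m+1 of the selected SNPs. That requirement, the function names, and the simple set-multicover greedy rule are assumptions made for this example; the sketch is not claimed to reproduce either of the paper's greedy algorithms or their approximation bounds.

```python
from itertools import combinations


def is_robust(patterns, chosen, m):
    """Check the assumed robustness condition: every pair of patterns
    differs at m + 1 or more of the chosen SNP positions."""
    return all(
        sum(a[s] != b[s] for s in chosen) >= m + 1
        for a, b in combinations(patterns, 2)
    )


def greedy_robust_tag_snps(patterns, m):
    """Hypothetical greedy selection of robust tag SNPs.

    patterns: list of equal-length strings, one per haplotype pattern.
    m: number of tag SNPs that may be missing.
    Returns the chosen SNP indices in the order they were picked.
    """
    n_snps = len(patterns[0])
    pairs = list(combinations(range(len(patterns)), 2))
    # need[pair] = how many more distinguishing SNPs this pair still requires
    need = {pair: m + 1 for pair in pairs}
    remaining = set(range(n_snps))
    chosen = []

    def gain(s):
        # Number of still-unsatisfied pairs that SNP s distinguishes
        return sum(
            1 for a, b in pairs
            if need[(a, b)] > 0 and patterns[a][s] != patterns[b][s]
        )

    while any(v > 0 for v in need.values()):
        if not remaining:
            raise ValueError("no robust tag SNP set exists for this m")
        # Pick the SNP with the largest gain (lowest index on ties)
        best = max(sorted(remaining), key=gain)
        if gain(best) == 0:
            raise ValueError("no robust tag SNP set exists for this m")
        for a, b in pairs:
            if need[(a, b)] > 0 and patterns[a][best] != patterns[b][best]:
                need[(a, b)] -= 1
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Toy block with K = 3 patterns over 6 SNPs (illustrative data only)
block = ["000000", "111000", "000111"]
tags = greedy_robust_tag_snps(block, m=1)
```

With m = 0 the condition reduces to ordinary tag SNP selection, where each pair of patterns only needs one distinguishing SNP; raising m forces redundancy so that the patterns stay distinguishable when up to m selected SNPs are missing.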