Dynamic programming algorithms for haplotype block partitioning: applications to human chromosome 21 haplotype data

Kui Zhang,Fengzhu Sun,Michael S. Waterman,Ting Chen
DOI: https://doi.org/10.1145/640075.640119
2003-01-01
Abstract:Recent studies have shown that the human genome has a haplotype block structure such that it can be divided into discrete blocks of limited haplotype diversity. Patil et al. [6] and Zhang et al. [12] developed algorithms to partition haplotypes into blocks with minimum number of tag SNPs for the entire chromosome. However, it is not clear how to partition haplotypes into blocks with restricted number of SNPs when only limited resources are available. In this paper, we first formulated this problem as finding a block partition with a fixed number of tag SNPs that can cover the maximal percentage of a genome. Then we solved it by two dynamic programming algorithms, which are fairly flexible to take into account the knowledge of functional polymorphism. We applied our algorithms to the published SNP data of human chromosome 21 combining with the functional information of these SNPs and demonstrated the effectiveness of them. Statistical investigation of the relationship between the starting points of a block partition and the coding and non-coding regions illuminated that the SNPs at these starting points are not significantly enriched in coding regions. We also developed an efficient algorithm to find all possible long local maximal haplotypes across a subset of samples. After applying this algorithm to the human chromosome 21 haplotype data, we found that samples with long local haplotypes are not necessarily globally similar.
What problem does this paper attempt to address?