Sequence Analysis Based Adaptive Hierarchical Clustering Approach for Admixture Population Structure Inference

Jun Wang, Xiaoyan Liu
DOI: https://doi.org/10.1007/978-3-642-28744-2_82
2012-01-01
Journal of Information and Computational Science
Abstract: Population structure inference is an important problem in many areas of human genetics, but inferring the structure of an admixture population is particularly difficult. Traditional Bayesian methods are often time-consuming and may run into convergence problems. We therefore propose a novel approach for rapidly inferring admixture population stratification from genotype data. A feature selection step reduces the cost of inference and removes noise. The genetic distance between two individuals is calculated with a sequence analysis algorithm, and the resulting distance matrix is passed to an adaptive hierarchical clustering algorithm that infers the population structure. Compared with software based on Bayesian methods (e.g., STRUCTURE), our approach is computationally more efficient and yields a more accurate stratification of the admixture population.
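
The three stages named in the abstract (feature selection, pairwise genetic distance, hierarchical clustering on the distance matrix) can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not specify the feature selection criterion, the sequence analysis algorithm, or what makes the clustering adaptive. The sketch assumes genotypes coded as 0/1/2 allele counts, substitutes a variance filter for feature selection, a normalized allele-count difference for the distance, and average-linkage merging that stops at a fixed threshold in place of the adaptive rule. All function names and the `min_variance` and `merge_threshold` parameters are hypothetical.

```python
from itertools import combinations


def select_informative_markers(genotypes, min_variance=0.1):
    """Return indices of marker columns whose variance is at least min_variance.

    Assumption: a plain variance filter stands in for the paper's
    unspecified feature selection step.
    """
    n = len(genotypes)
    kept = []
    for j in range(len(genotypes[0])):
        column = [row[j] for row in genotypes]
        mean = sum(column) / n
        variance = sum((g - mean) ** 2 for g in column) / n
        if variance >= min_variance:
            kept.append(j)
    return kept


def genotype_distance(a, b, markers):
    """Mean absolute allele-count difference over markers, scaled to [0, 1].

    Assumption: this simple measure replaces the paper's sequence
    analysis algorithm, which the abstract does not describe.
    """
    return sum(abs(a[j] - b[j]) for j in markers) / (2 * len(markers))


def cluster_individuals(genotypes, markers, merge_threshold=0.5):
    """Average-linkage agglomerative clustering on the distance matrix.

    Assumption: merging stops once the closest pair of clusters is farther
    apart than merge_threshold, so the number of clusters is not fixed in
    advance. The paper's actual adaptive criterion may differ.
    """
    if not markers:
        raise ValueError("at least one marker is required")
    n = len(genotypes)
    # Pairwise distance matrix, stored for index pairs (i, k) with i < k.
    dist = {
        (i, k): genotype_distance(genotypes[i], genotypes[k], markers)
        for i, k in combinations(range(n), 2)
    }
    clusters = [[i] for i in range(n)]

    def linkage(c1, c2):
        pairs = [(min(i, k), max(i, k)) for i in c1 for k in c2]
        return sum(dist[p] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Find the closest pair of clusters under average linkage.
        (x, y), best = min(
            (((x, y), linkage(clusters[x], clusters[y]))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best > merge_threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # safe because x < y
    return [sorted(c) for c in clusters]


# Toy data: four individuals, four markers; the last marker is uninformative.
genotypes = [
    [0, 0, 2, 1],
    [0, 1, 2, 1],
    [2, 2, 0, 1],
    [2, 2, 1, 1],
]
markers = select_informative_markers(genotypes)
clusters = cluster_individuals(genotypes, markers)
print(markers, clusters)
```

On this toy input the constant fourth marker is filtered out and the four individuals fall into two groups. A real analysis would need the paper's own distance measure and stopping rule, and an optimized clustering routine for realistic sample sizes.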