High-throughput sequencing-based microsatellite genotyping for polyploids to resolve allele dosage uncertainty and improve analyses of genetic diversity, structure and differentiation: A case study of the hexaploid Camellia oleifera

Xiangyan Cui, Caihua Li, Shengyuan Qin, Zebin Huang, Bin Gan, Zhengwen Jiang, Xiaomao Huang, Xiaoqiang Yang, Qin Li, Xiaoguo Xiang, Jiakuan Chen, Yao Zhao, Jun Rong
DOI: https://doi.org/10.1111/1755-0998.13469
Abstract: Conventional microsatellite (simple sequence repeat, SSR) genotyping methods cannot accurately identify polyploid genotypes, which leads to allele dosage uncertainty and introduces biases into population genetic analyses. Here, a new SSR genotyping method was developed to directly infer accurate polyploid genotypes. The frequency distribution of SSR sequences was obtained from deep-coverage high-throughput sequencing data. Corrections were applied to account for "stutter peaks" and for differences in the amplification efficiency of SSR sequences. Perl scripts and an online SSR genotyping tool, "SSRSeq", were provided to process the sequencing data and output genotypes with corrected allele dosages. Hexaploid Camellia oleifera is the dominant woody oilseed crop in China. Understanding the geographical pattern of genetic variation in wild C. oleifera is essential for the conservation and utilization of its genetic resources. Six wild C. oleifera populations were sampled across geographical ranges in the subtropical evergreen broadleaf forests of China. Using 35 SSR markers, the high-throughput sequencing-based SSRSeq method was applied to obtain accurate hexaploid genotypes of wild C. oleifera. The results demonstrated that the new method could resolve allele dosage uncertainty and considerably improve analyses of genetic diversity, structure and differentiation for polyploids. The genetic variation patterns of wild C. oleifera across geographical ranges agree with the "central-marginal hypothesis", which states that genetic diversity is highest in the central population and declines towards the peripheral populations, while genetic differentiation increases from the centre to the periphery. The method and findings can facilitate the utilization of wild C. oleifera genetic resources for the breeding of cultivated C. oleifera.
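
To make the idea of "allele dosage" concrete, the following is a minimal Python sketch of how read counts at one SSR locus might be turned into a hexaploid genotype. It is not the SSRSeq implementation and does not reproduce the stutter or amplification-efficiency corrections described in the abstract; it assumes those corrections have already been applied to the counts. The function names `compositions` and `estimate_dosage`, the `min_fraction` noise threshold, and the example counts are all hypothetical, chosen for illustration only.

```python
from itertools import combinations


def compositions(total, parts):
    """Yield every way to split `total` into `parts` positive integers."""
    # Stars and bars: pick parts-1 cut points among the total-1 gaps.
    for cuts in combinations(range(1, total), parts - 1):
        bounds = (0,) + cuts + (total,)
        yield tuple(bounds[i + 1] - bounds[i] for i in range(parts))


def estimate_dosage(read_counts, ploidy=6, min_fraction=0.05):
    """Assign integer copy numbers summing to `ploidy` to the alleles at one locus.

    read_counts maps allele (e.g. fragment length in bp) to its read count,
    assumed to be already corrected for stutter and amplification bias.
    """
    total = sum(read_counts.values())
    if total <= 0:
        raise ValueError("no reads at this locus")

    # Treat alleles below the (hypothetical) threshold as sequencing noise.
    kept = {a: c for a, c in read_counts.items() if c / total >= min_fraction}
    if not kept or len(kept) > ploidy:
        raise ValueError("number of retained alleles is incompatible with ploidy")

    alleles = sorted(kept)
    kept_total = sum(kept.values())
    observed = [kept[a] / kept_total for a in alleles]

    # Choose the dosage vector whose expected read proportions are closest
    # (least squares) to the observed proportions.
    best = min(
        compositions(ploidy, len(alleles)),
        key=lambda d: sum((x / ploidy - o) ** 2 for x, o in zip(d, observed)),
    )
    return dict(zip(alleles, best))


# Illustrative counts only (not data from the paper).
example_counts = {150: 480, 154: 330, 158: 170, 160: 20}
example_genotype = estimate_dosage(example_counts)
```

With these illustrative counts, the allele at 160 bp falls below the noise threshold and the remaining three alleles receive three, two and one copies. A conventional peak-based method would report only which alleles are present, leaving the copy numbers ambiguous; that ambiguity is the allele dosage uncertainty the abstract refers to.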