Abstract

Background: For genomic selection in populations with a small reference population, combining populations of the same breed or populations of related breeds is an effective way to increase the size of the reference population. However, genomic predictions based on single nucleotide polymorphism (SNP)-chip genotype data using combined populations with different genetic backgrounds or from different breeds have not shown a clear advantage over within-population or within-breed predictions. The increasing availability of whole-genome sequencing (WGS) data provides new opportunities for combined population genomic prediction. Our objective was to investigate the accuracy of genomic prediction using imputation-based WGS data from combined populations in pigs. Using 80K SNP panel genotypes, WGS genotypes, or genotypes on WGS variants that were pruned based on linkage disequilibrium (LD), three methods [genomic best linear unbiased prediction (GBLUP), single-step (ss)GBLUP, and genomic feature (GF)BLUP] were implemented with different prior information to identify the best method to improve the accuracy of genomic prediction for combined populations in pigs.

Results: In total, 2089 and 2043 individuals with production and reproduction phenotypes, respectively, from three Yorkshire populations with different genetic backgrounds were genotyped with the PorcineSNP80 panel. Imputation accuracy from 80K to WGS variants reached 92%. Using WGS data instead of the 80K SNP panel did not increase the accuracy of genomic prediction in a single population, but using WGS data with LD pruning and GFBLUP with prior information did yield higher accuracy than the 80K SNP panel. For the 80K SNP panel genotypes, using the combined population resulted in a slight improvement, no change, or even a slight decrease in accuracy compared with the single population for GBLUP and ssGBLUP, while accuracy increased by 1 to 2.4% when using WGS data. Notably, the GFBLUP method did not perform well for either the combined population or the single populations.

Conclusions: The use of WGS data was beneficial for combined population genomic prediction. Simply increasing the number of SNPs to the WGS level did not increase accuracy for a single population, while using pruned WGS data based on LD and GFBLUP with prior information could yield higher accuracy than the 80K SNP panel.
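
To make two of the steps named in the abstract concrete (LD-based pruning of variants and GBLUP), the following is a minimal Python sketch. It is not the authors' pipeline. Everything in it is an assumption made for illustration: the function names (`ld_prune`, `vanraden_grm`, `gblup`), the greedy pruning rule, the `r2_max` and `window` defaults, the use of a VanRaden-style genomic relationship matrix, and the treatment of heritability `h2` as known rather than estimated. ssGBLUP and GFBLUP are not covered.

```python
import numpy as np

def ld_prune(M, r2_max=0.9, window=50):
    """Greedy LD pruning (illustrative rule, not the paper's settings).

    M is an (individuals x SNPs) matrix of 0/1/2 allele counts, with SNPs
    in map order. A SNP is kept unless its squared correlation with one of
    the last `window` kept SNPs exceeds `r2_max`. Monomorphic SNPs are dropped.
    """
    kept = []
    for j in range(M.shape[1]):
        col = M[:, j]
        if col.std() == 0:
            continue  # no variation, carries no information
        in_ld = False
        for k in kept[-window:]:
            r = np.corrcoef(col, M[:, k])[0, 1]
            if r * r > r2_max:
                in_ld = True
                break
        if not in_ld:
            kept.append(j)
    return np.array(kept, dtype=int)

def vanraden_grm(M):
    """Genomic relationship matrix G = ZZ' / (2 * sum p(1-p))."""
    p = M.mean(axis=0) / 2.0          # allele frequencies
    Z = M - 2.0 * p                   # centre each SNP
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup(y, G, h2, train):
    """GBLUP for the model y = mu + g + e with g ~ N(0, G * sigma_g^2).

    Phenotypes of the `train` individuals are used to predict genomic
    breeding values for everyone in G. `h2` is assumed known here.
    """
    lam = (1.0 - h2) / h2                                # sigma_e^2 / sigma_g^2
    V = G[np.ix_(train, train)] + lam * np.eye(len(train))
    ones = np.ones(len(train))
    # Generalised least squares estimate of the mean: 1'V^-1 y / 1'V^-1 1
    mu = np.linalg.solve(V, y[train]).sum() / np.linalg.solve(V, ones).sum()
    alpha = np.linalg.solve(V, y[train] - mu)
    return mu, G[:, train] @ alpha

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 60, 200                                        # toy sizes only
    M = rng.binomial(2, 0.3, size=(n, m)).astype(float)
    y = M[:, :20] @ rng.normal(size=20) + rng.normal(scale=3.0, size=n)
    kept = ld_prune(M)
    G = vanraden_grm(M[:, kept])
    mu, gebv = gblup(y, G, h2=0.5, train=np.arange(45))
    print(len(kept), "SNPs kept; first validation GEBVs:", gebv[45:50])
```

In practice the variance components would be estimated (for example by REML) rather than fixed, and pruning would normally be done with a dedicated tool on phased or imputed data; the sketch only shows where pruning sits relative to building the relationship matrix and solving for breeding values.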

Best practices for analyzing imputed genotypes from low-pass sequencing in dogs

The Construction of a Haplotype Reference Panel Using Extremely Low Coverage Whole Genome Sequences and Its Application in Genome-Wide Association Studies and Genomic Prediction in Duroc Pigs

Strategies for Obtaining and Pruning Imputed Whole-Genome Sequence Data for Genomic Prediction

Exploring the optimal strategy of imputation from SNP array to whole-genome sequencing data in farm animals

The Efficient Phasing and Imputation Pipeline of Low‐coverage Whole Genome Sequencing Data Using a High‐quality and Publicly Available Reference Panel in Cattle

Imputation of ancient canid genomes reveals inbreeding history over the past 10,000 years

A New Genotype Imputation Method with Tolerance to High Missing Rate and Rare Variants

Using imputation-based whole-genome sequencing data to improve the accuracy of genomic prediction for combined populations in pigs

On Combining Reference Data to Improve Imputation Accuracy

Genotyping by Genome Reducing and Sequencing for Outbred Animals

Assessment of the performance of different imputation methods for low-coverage sequencing in Holstein cattle

Comparison of different imputation methods from low- to high-density panels using Chinese Holstein cattle

Large-scale Genotyping of Complex DNA

Integration of ssGWAS and ROH Analyses for Uncovering Genetic Variants Associated with Reproduction Traits in Large White Pigs

Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel

Performance of Genotype Imputation for Low Frequency and Rare Variants from the 1000 Genomes

A map of canine sequence variation relative to a Greenland wolf outgroup

A beginner's guide to low‐coverage whole genome sequencing for population genomics

DNA copy number analysis of fresh and formalin-fixed specimens by shallow whole-genome sequencing with identification and exclusion of problematic regions in the genome assembly

Empirical versus estimated accuracy of imputation: optimising filtering thresholds for sequence imputation

A Novel Efficient Algorithm for Common Variants Genotyping from Low-Coverage Sequencing Data