Abstract:

Background: 'Selection signatures' delimit regions of the genome that are, or have been, functionally important and have therefore been under either natural or artificial selection. In this study, two different and complementary methods, the integrated Haplotype Homozygosity Score (|iHS|) and the population differentiation index (FST), were applied to identify traces of decades of intensive artificial selection for traits of economic importance in modern cattle.

Results: We scanned the genome of a diverse set of dairy and beef breeds from Germany, Canada and Australia genotyped with a 50 K SNP panel. Across breeds, a total of 109 extreme |iHS| values exceeded the empirical threshold level of 5%, with 19, 27, 9, 10 and 17 outliers in Holstein, Brown Swiss, Australian Angus, Hereford and Simmental, respectively. Annotating the regions harboring clustered |iHS| signals revealed candidate genes such as SPATA17, MGAT1 and PGRMC2 in the context of reproduction, and ACTC1, COL23A1 and MATN2 in the context of muscle formation. In a further step, a new Bayesian FST-based approach was applied to a set of geographically separated populations (Holstein, Brown Swiss, Simmental, North American Angus and Piedmontese) to detect differentiated loci. In total, 127 regions exceeding the 2.5% threshold of the empirical posterior distribution were identified as extremely differentiated. In a substantial share of cases (56 of 127), the extreme FST values lay in gene-poor regions, a proportion that deviated significantly (p < 0.05) from the expectation under a random distribution. Nevertheless, significant FST values were also found in the regions of relevant genes such as SMCP and FGF1.

Conclusions: Overall, 236 regions putatively subject to recent positive selection in the cattle genome were detected. Both |iHS| and FST suggested selection in the vicinity of the Sialic acid binding Ig-like lectin 5 gene on BTA18. This region was recently reported to be a major QTL with strong effects on productive life and fertility traits in Holstein cattle. We conclude that high-resolution genome scans of selection signatures can be used to identify genomic regions contributing to within- and inter-breed phenotypic variation.
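To make the idea of an empirical-threshold outlier scan concrete, the following is a minimal sketch in Python. It is not the Bayesian FST method used in the study: it substitutes a plain Wright's FST computed from per-population allele frequencies with equal population weights, and it flags the top fraction of SNPs by rank. The function names `wright_fst` and `empirical_outliers` and the toy allele frequencies are hypothetical and serve only as illustration.

```python
from typing import List, Sequence


def wright_fst(freqs: Sequence[float]) -> float:
    """Wright's FST for one biallelic SNP.

    `freqs` holds the frequency of one allele in each population.
    Populations are weighted equally (an assumption of this sketch).
    """
    n = len(freqs)
    p_bar = sum(freqs) / n
    # Expected heterozygosity of the pooled population.
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        return 0.0  # SNP is monomorphic across all populations
    # Mean expected heterozygosity within populations.
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / n
    return (h_t - h_s) / h_t


def empirical_outliers(values: Sequence[float], top_fraction: float) -> List[int]:
    """Indices of the values in the top `top_fraction` of the empirical distribution."""
    k = max(1, int(len(values) * top_fraction))
    ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return sorted(ranked[:k])


# Toy data: allele frequencies of four SNPs in five populations (made up).
snp_freqs = [
    [0.50, 0.52, 0.48, 0.51, 0.49],  # barely differentiated
    [0.05, 0.95, 0.10, 0.90, 0.50],  # strongly differentiated
    [0.30, 0.35, 0.40, 0.30, 0.35],
    [0.20, 0.25, 0.80, 0.20, 0.25],
]
fst_values = [wright_fst(f) for f in snp_freqs]
# With so few SNPs, a 25% cutoff is used here purely for demonstration.
outlier_idx = empirical_outliers(fst_values, top_fraction=0.25)
```

In a real scan the cutoff would be applied genome-wide (for example the upper 2.5% of the distribution), and neighbouring outliers would typically be merged into regions before annotation.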
