SNV Discovery and Functional Candidate Gene Identification for Milk Composition Based on Whole Genome Resequencing of Holstein Bulls with Extremely High and Low Breeding Values
Shan Lin, Hongyan Zhang, Yali Hou, Lin Liu, Wenhui Li, Jianping Jiang, Bo Han, Shengli Zhang, Dongxiao Sun
DOI: https://doi.org/10.1371/journal.pone.0220629
IF: 3.7
2019-01-01
PLoS ONE
Abstract: We sequenced the whole genomes of eight proven Holstein bulls from four half-sib or full-sib families with extremely high and low estimated breeding values (EBVs) for milk protein percentage (PP) and fat percentage (FP) using Illumina resequencing technology. In total, 2.3 billion raw reads were obtained, with an average effective depth of 8.1x. After single nucleotide variant (SNV) calling, a total of 10,961,243 SNVs were identified, of which 57,451 showed opposite fixed sites between the bulls with high and low EBVs within each family (termed common differential SNVs). Next, we annotated the common differential SNVs against the bovine reference genome and observed that 45,188 SNVs (78.70%) were located in intergenic regions and only 11,871 SNVs (20.67%) were located within protein-coding genes. Of these, 13,099 common differential SNVs that lay within protein-coding genes or less than 5 kb from them were chosen for identifying candidate genes for milk composition in dairy cattle. Through integrated analysis of the 2,657 corresponding genes with GO terms and pathways related to protein and fat metabolism, and with known quantitative trait loci (QTLs) for milk protein and fat traits, we identified 17 promising candidate genes: ALG14, ATP2C1, PLD1, C3H1orf85, SNX7, MTHFD2L, CDKN2D, COL5A3, FDX1L, PIN1, FIG4, EXOC7, LASP1, PGS1, SAO, GPLD1 and MGEA5. Our findings provide an important foundation for further study and a basis for molecular breeding of dairy cattle.
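The two filtering steps named in the abstract (opposite fixed sites within each family, then proximity to a protein-coding gene) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes that "opposite fixed" means the high-EBV and low-EBV bull of a family are each homozygous but for different alleles, that "common" means the condition holds in every family, and that each family contributes one high and one low bull. All names here (`is_opposite_fixed`, `common_differential_snvs`, `within_flank`, the bull IDs and the toy genotypes) are hypothetical.

```python
from typing import Dict, List, Tuple

Genotype = Tuple[str, str]   # the two alleles called for one bull
Site = Tuple[str, int]       # (chromosome, 1-based position)


def is_opposite_fixed(high: Genotype, low: Genotype) -> bool:
    """Both bulls homozygous, but for different alleles (assumed definition)."""
    return high[0] == high[1] and low[0] == low[1] and high[0] != low[0]


def common_differential_snvs(
    calls: Dict[Site, Dict[str, Genotype]],
    families: List[Tuple[str, str]],
) -> List[Site]:
    """Keep sites that are opposite fixed in every (high, low) family pair.

    Sites with a missing genotype for any bull are skipped.
    """
    selected = []
    for site, genotypes in calls.items():
        if all(
            h in genotypes
            and l in genotypes
            and is_opposite_fixed(genotypes[h], genotypes[l])
            for h, l in families
        ):
            selected.append(site)
    return sorted(selected)


def within_flank(pos: int, gene_start: int, gene_end: int, flank: int = 5000) -> bool:
    """True if pos lies inside the gene or less than `flank` bp from it."""
    if pos < gene_start:
        distance = gene_start - pos
    elif pos > gene_end:
        distance = pos - gene_end
    else:
        distance = 0
    return distance < flank


# Toy data with two hypothetical families (the study used four).
toy_families = [("F1_high", "F1_low"), ("F2_high", "F2_low")]
toy_calls = {
    ("chr6", 1000): {
        "F1_high": ("A", "A"), "F1_low": ("G", "G"),
        "F2_high": ("A", "A"), "F2_low": ("G", "G"),
    },
    ("chr6", 2000): {  # F1_low is heterozygous, so the site is dropped
        "F1_high": ("A", "A"), "F1_low": ("A", "G"),
        "F2_high": ("A", "A"), "F2_low": ("G", "G"),
    },
}
toy_selected = common_differential_snvs(toy_calls, toy_families)
```

The sketch does not require the high-EBV bulls to share the same allele across families. Whether the original analysis imposed that constraint is not stated in the abstract.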