Improving the annotation of the cattle genome by annotating transcription start sites in a diverse set of tissues and populations using Cap Analysis Gene Expression sequencing

Mazdak Salavati,Richard Clark,Doreen Becker,Christa Kühn,Graham Plastow,Sébastien Dupont,Gabriel Costa Monteiro Moreira,Carole Charlier,Emily Louise Clark
DOI: https://doi.org/10.1093/g3journal/jkad108
2023-08-09
Abstract:Understanding the genomic control of tissue-specific gene expression and regulation can help to inform the application of genomic technologies in farm animal breeding programs. The fine mapping of promoters [transcription start sites (TSS)] and enhancers (divergent amplifying segments of the genome local to TSS) in different populations of cattle across a wide diversity of tissues provides information to locate and understand the genomic drivers of breed- and tissue-specific characteristics. To this aim, we used Cap Analysis Gene Expression (CAGE) sequencing, of 24 different tissues from 3 populations of cattle, to define TSS and their coexpressed short-range enhancers (<1 kb) in the ARS-UCD1.2_Btau5.0.1Y reference genome (1000bulls run9) and analyzed tissue and population specificity of expressed promoters. We identified 51,295 TSS and 2,328 TSS-Enhancer regions shared across the 3 populations (dairy, beef-dairy cross, and Canadian Kinsella composite cattle from 2 individuals, 1 of each sex, per population). Cross-species comparative analysis of CAGE data from 7 other species, including sheep, revealed a set of TSS and TSS-Enhancers that were specific to cattle. The CAGE data set will be combined with other transcriptomic information for the same tissues to create a new high-resolution map of transcript diversity across tissues and populations in cattle for the BovReg project. Here we provide the CAGE data set and annotation tracks for TSS and TSS-Enhancers in the cattle genome. This new annotation information will improve our understanding of the drivers of gene expression and regulation in cattle and help to inform the application of genomic technologies in breeding programs.
What problem does this paper attempt to address?