Functional genomics of cattle through integration of multi-omics data
Hamid Beiki, Brenda M Murdoch, Carissa A Park, Chandlar Kern, Denise Kontechy, Gabrielle Becker, Gonzalo Rincon, Honglin Jiang, Huaijun Zhou, Jacob Thorne, James E Koltes, Jennifer J Michal, Kimberly Davenport, Monique Rijnkels, Pablo J Ross, Rui Hu, Sarah Corum, Stephanie McKay, Timothy P.L Smith, Wansheng Liu, Wenzhi Ma, Xiaohui Zhang, Xiaoqing Xu, Xuelei Han, Zhihua Jiang, Zhi-Liang Hu, James M Reecy
DOI: https://doi.org/10.1101/2022.10.05.510963
2022-10-07
bioRxiv
Abstract: Functional annotation of the bovine genome was performed by characterizing the spectrum of RNA transcription using a multi-omics approach that combined long- and short-read transcript sequencing with orthogonal data to identify promoters and enhancers and to determine the boundaries of open chromatin. A total of 171,985 unique transcripts (50% protein-coding) representing 35,150 unique genes (64% protein-coding) were identified across tissues. Of these, 159,033 transcripts (92% of the total) were structurally validated by independent datasets such as PacBio Iso-seq, ONT-seq, de novo assembled transcripts from RNA-seq, or Ensembl and NCBI gene sets. In addition, all transcripts were supported by extensive independent data from different technologies such as WTTS-seq, RAMPAGE, ChIP-seq, and ATAC-seq. A large proportion of the identified transcripts (69%) were novel, of which 87% were produced by known genes and 13% by novel genes. A median of two 5' untranslated regions was detected per gene, compared with one in the Ensembl and NCBI annotations. Around 50% of protein-coding genes in each tissue were bifunctional, transcribing both coding and non-coding isoforms. Furthermore, we identified 3,744 genes that functioned as non-coding genes in fetal tissues but as protein-coding genes in adult tissues. Our new bovine genome annotation extended more than 11,000 known gene borders compared to Ensembl or NCBI annotations. The resulting bovine transcriptome was integrated with publicly available QTL data to study the tissue-tissue interconnections involved in different traits and to construct the first bovine trait similarity network. These validated results show significant improvement over current bovine genome annotations.