Identification of novel genes in cattle (Bos taurus) and biological insights into their function in embryo development

Gustavo P. Schettini, Michael Morozyuk, Fernando H. Biase
DOI: https://doi.org/10.1101/2024.03.15.585311
2024-03-17
Abstract: Appropriate regulation of genes expressed in oocytes and embryos is essential for acquisition of developmental competence in mammals. Here, we hypothesized that several genes expressed in oocytes and pre-implantation embryos remain unknown. Our goal was to reconstruct the transcriptome of oocytes (germinal vesicle and metaphase II) and pre-implantation cattle embryos (blastocysts) using short-read and long-read sequences to identify putative new genes. We identified 274,342 transcript sequences, and 3,033 of those transcripts do not match a gene present in an annotation, thus are potential new genes. Notably, 63.67% (1,931/3,033) of potential novel genes exhibited coding potential. Also noteworthy, 97.92% of the putative novel genes overlapped annotation with transposable elements. Comparative analysis of transcript abundance identified that 1,840 novel genes (recently added to the annotation) or potential new genes were differentially expressed between developmental stages (FDR<0.01). We also determined that 522 novel or potential new genes (448 and 34 respectively) were upregulated at eight-cell embryos compared to oocytes (FDR<0.01). In eight-cell embryos, 102 novel or putative new genes were co-expressed (|r|>0.85, P<1×10 ) with several genes annotated with gene ontology processes related to pluripotency maintenance and embryo development. CRISPR-Cas9 genome editing confirmed that the disruption of one of the novel genes highly expressed in eight-cell embryos reduced blastocyst development (ENSBTAG00000068261, P=1.55×10 ). In conclusion, our results revealed several putative new genes that need careful annotation. Many of the putative new genes have dynamic regulation during pre-implantation development and are important components of gene regulatory networks involved in pluripotency and blastocyst formation.
Genomics
What problem does this paper attempt to address?
This paper sets out to identify new genes in bovine (Bos taurus) oocytes and early embryos and to explore their biological functions in embryonic development. Specifically, the researchers hypothesized that several genes expressed in oocytes and pre-implantation embryos remain unknown. To test this hypothesis, they used short-read and long-read sequencing to reconstruct the transcriptomes of oocytes (germinal vesicle and metaphase II stages) and pre-implantation bovine embryos (blastocysts) in order to identify potential new genes.

### Main research objectives:

1. **Reconstruct transcriptomes**: Reconstruct the transcriptomes of oocytes and early embryos by combining short-read and long-read sequencing data.
2. **Identify new genes**: Identify transcripts in the reconstructed transcriptomes that do not match any gene in the existing annotation.
3. **Evaluate the functions of new genes**: Assess their potential functions in embryonic development by analyzing their expression patterns across developmental stages and their co-expression with known genes.

### Research background:

- **Importance of gene expression**: Appropriate regulation of gene expression is essential for mammalian oocytes and embryos to acquire developmental competence.
- **Limitations of existing research**: Although high-throughput sequencing has revealed many differences in gene expression, a large number of genes remain undiscovered or unannotated.
- **Technical advantages**: Long-read sequencing can capture full-length transcripts, while short-read sequencing provides high-accuracy sequence data. Combining the two allows a more comprehensive reconstruction of the transcriptome.

### Research methods:

- **Sample collection**: Bovine ovaries were collected from commercial slaughterhouses, and oocytes and early embryos were isolated.
- **Transcriptome sequencing**: Samples were sequenced using short-read and long-read technologies.
- **Data analysis**: The transcriptome was reconstructed bioinformatically and compared with the existing gene annotation to identify potential new genes.
- **Functional analysis**: Expression patterns of the new genes across developmental stages, and their likely biological functions, were evaluated by differential expression analysis and co-expression network analysis.

### Main findings:

- **Identification of new genes**: The researchers identified 274,342 transcript sequences, of which 3,033 did not match any gene in the existing annotation and were considered potential new genes.
- **Coding potential**: 63.67% (1,931/3,033) of the potential new genes had coding potential; the remaining 36.33% might be long non-coding RNAs.
- **Transposable element association**: 97.92% of the potential new genes overlapped annotated transposable elements.
- **Expression patterns**: 1,840 novel genes (recently added to the annotation) or potential new genes were differentially expressed between developmental stages (FDR<0.01).
- **Functional verification**: CRISPR-Cas9 genome editing showed that disrupting a novel gene highly expressed in eight-cell embryos (ENSBTAG00000068261) reduced blastocyst development.

### Conclusions:

- **Importance of new genes**: The results revealed several putative new genes. Many of them are dynamically regulated during pre-implantation development and are components of gene regulatory networks involved in pluripotency and blastocyst formation.
- **Future directions**: These putative new genes need careful annotation and further functional verification to clarify their specific roles in embryonic development.

In conclusion, this study not only expands the understanding of the bovine genome but also provides new insights into the complex gene regulatory network of early embryonic development.
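
The co-expression criterion reported in the abstract (|r|>0.85) can be illustrated with a minimal sketch. It assumes that r is a Pearson correlation computed across samples, which the summary does not state. The gene names and expression values below are hypothetical toy data, not results from the paper, and the significance filter on the correlation is omitted.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def coexpressed_pairs(candidates, annotated, threshold=0.85):
    """Return (candidate, annotated, r) for every pair with |r| > threshold."""
    pairs = []
    for cand_name, cand_values in candidates.items():
        for ann_name, ann_values in annotated.items():
            r = pearson_r(cand_values, ann_values)
            if abs(r) > threshold:
                pairs.append((cand_name, ann_name, r))
    return pairs


# Hypothetical expression values across five samples (illustration only).
candidates = {"candidate_A": [1.0, 2.0, 3.0, 4.0, 5.0]}
annotated = {
    "annotated_X": [2.1, 3.9, 6.2, 8.0, 9.8],  # rises with candidate_A
    "annotated_Y": [5.0, 4.1, 2.9, 2.0, 1.1],  # falls as candidate_A rises
    "annotated_Z": [3.0, 1.0, 4.0, 1.0, 3.0],  # no linear relationship
}

pairs = coexpressed_pairs(candidates, annotated)
for cand_name, ann_name, r in pairs:
    print(f"{cand_name} ~ {ann_name}: r = {r:.3f}")
```

Filtering on the absolute value keeps both positively and negatively correlated pairs, matching the |r| notation in the abstract.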