De Novo Transcriptome Analysis Reveals Novel Insights into Secondary Metabolite Biosynthesis in (Burm. f) Merrill

Anamika Gupta, Deeksha Singh, Rajiv Ranjan
DOI: https://doi.org/10.1101/2024.03.05.583525
2024-03-11
Abstract: This species has been widely recognized for its therapeutic properties in traditional Indian medicine. Although its bioactive compounds are used extensively to treat a wide range of ailments, the genetic basis of their biosynthesis remains poorly understood. In this study, we conducted a transcriptomic analysis of leaf and root tissues using the Illumina platform. High-quality RNA was isolated, and cDNA libraries were constructed for sequencing, generating 4.67 GB and 5.51 GB of data for leaf and root samples, respectively. De novo assembly of the reads predicted 72,795 unigenes and 24,470 coding sequences (CDS), revealing a complex transcriptome landscape. Functional annotation and pathway analysis revealed biological processes and pathways associated with this species. Based on Gene Ontology (GO) mapping, the CDS were categorized into biological processes, cellular components, and molecular functions. Pathway analysis using the KEGG database revealed involvement in critical metabolic pathways. Furthermore, the identification of simple sequence repeats (SSRs) contributed to the understanding of genetic diversity. In addition, differential gene expression analysis identified genes involved in secondary metabolite synthesis, among other physiological processes. The qRT-PCR validation of selected genes confirmed their differential expression profiles, with roots exhibiting higher expression than leaves. This study presents the first transcriptome analysis of this species, which may be useful for future molecular research. The detailed findings improve our understanding of the species' biology, which can be applied in biotechnology, and they underline the importance of conserving this species because of its medicinal use.
Bioinformatics
What problem does this paper attempt to address?
This preprint is a bioinformatics study in genomics and transcriptomics. It reports the de novo transcriptome assembly of leaf and root samples, comprising 87,340 transcripts with a combined length of 43,772,184 base pairs. The longest transcript is 8,554 base pairs, and the N50 is 694 base pairs. The study also identified 72,795 unigenes with a total length of 35,451,036 base pairs and analyzed their coding sequences (CDS). It found 24,470 CDS, the majority of which have homologs in databases such as NR, KOG, Pfam, Uniprot, and TF. The paper further groups the annotated sequences into pathway categories, including metabolic pathways, genetic information processing, environmental information processing, cellular processes, and organismal systems, which suggests a focus on the roles these genes play in specific biological processes. It also includes a list of primers used to validate the expression of selected genes, a standard step in experimentally verifying changes in gene expression levels. The problem the paper attempts to address is therefore most likely to characterize the biological functions and potential metabolic activities of the species through in-depth transcriptome analysis, identifying key genes and pathways, with experimental validation used to confirm the expression patterns of these genes.
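The assembly statistics quoted above (transcript count, combined length, N50) can be illustrated with a short sketch. This is not the authors' pipeline: the function names `n50` and `mean_length` are hypothetical, and the N50 definition used here is the common one (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length). The toy length list is invented for illustration; only the two totals in the last lines come from the summary.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumed definition: sort lengths from longest to shortest and
    accumulate them until the running sum reaches at least half of
    the total length; the length at which that happens is the N50.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def mean_length(total_bases, sequence_count):
    """Average sequence length from a combined length and a count."""
    return total_bases / sequence_count


# Toy example (hypothetical lengths, not data from the paper).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)

# Mean transcript length implied by the reported totals:
# 43,772,184 bp across 87,340 transcripts.
reported_mean = mean_length(43_772_184, 87_340)
```

With the reported totals, the mean transcript length comes out at roughly 501 base pairs, which sits below the reported N50 of 694 base pairs, as is typical when an assembly contains many short transcripts.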