Identification, characterization and transcriptional analysis of the long non-coding RNA landscape in the family Cucurbitaceae

Pascual Villalba-Bermell, Joan Marquez-Molins, Gustavo Gomez
DOI: https://doi.org/10.1101/2024.01.12.575433
2024-01-15
Abstract: Long non-coding RNAs (lncRNAs) constitute a fascinating class of regulatory RNAs, widely distributed in eukaryotes. In plants, they exhibit features such as tissue-specific expression, spatiotemporal regulation, and responsiveness to stress, suggesting their involvement in specific biological processes. Although an increasing number of studies support the regulatory role of lncRNAs in model plants, our knowledge about these transcripts in relevant crops is limited. In this study we employ a custom pipeline on a dataset of over 1,000 RNA-seq studies across nine representative species of the family Cucurbitaceae to predict 91,209 non-redundant lncRNAs. LncRNAs were predicted according to three confidence levels and classified into intergenic, natural antisense, intronic, and sense overlapping. Predicted lncRNAs have lower expression levels compared to protein-coding genes but a more specific behavior when considering plant tissues, developmental stages, and response to stress, emphasizing their potential roles in regulating various aspects of plant biology. The evolutionary analysis indicates higher positional conservation than sequence conservation, which may be linked to the presence of conserved modular motifs within syntenic lncRNAs. In short, this research provides a comprehensive map of lncRNAs in the agriculturally relevant Cucurbitaceae family, offering a valuable resource for future investigations in crop improvement.
Bioinformatics
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper addresses several key problems in the identification, characterization and transcriptional analysis of long non-coding RNAs (lncRNAs) in Cucurbitaceae plants:

1. **Comprehensive identification of lncRNAs**:
   - The researchers used a custom bioinformatics pipeline to analyze more than 1,000 RNA-seq datasets and predict lncRNAs in nine representative species of Cucurbitaceae. In total, 91,209 non-redundant lncRNAs were identified.
2. **Classification and feature analysis of lncRNAs**:
   - Based on their genomic location, orientation and relationship to adjacent protein-coding genes, these lncRNAs were classified into four categories: intergenic lncRNAs (lincRNAs), natural antisense lncRNAs (NAT-lncRNAs), intronic lncRNAs (int-lncRNAs) and sense overlapping lncRNAs (SOT-lncRNAs).
   - The genomic distribution of the different lncRNA types was analyzed; lincRNAs were the predominant type, accounting for 64.5% of the total.
3. **Expression patterns of lncRNAs**:
   - The expression of lncRNAs was studied across tissues, developmental stages and stress conditions. Their expression levels were generally lower than those of protein-coding genes, but more specific to particular tissues and conditions.
4. **Evolutionary conservation of lncRNAs**:
   - The evolutionary conservation of lncRNAs among Cucurbitaceae species was explored through sequence similarity and synteny (collinearity) analysis. Positional conservation (i.e., synteny) was significantly higher than sequence conservation.
5. **Functional potential of lncRNAs**:
   - The study emphasizes the potential roles of lncRNAs in regulating plant biological processes such as development, stress response and genomic stability.

These findings provide an important resource for using lncRNAs as potential biomarkers or traits for crop improvement in the future.

### Summary

Through comprehensive identification, classification, feature analysis and evolutionary conservation analysis of lncRNAs in Cucurbitaceae plants, this paper reveals the diversity and molecular characteristics of this emerging class of regulatory RNAs, providing a valuable resource for future research and crop improvement.
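
As an illustration of the location-based classification described above, the four categories can be expressed as a small decision rule. This is a minimal sketch, not the authors' pipeline: the `Gene` class, the `classify_lncrna` function, the label strings and the precedence of the rules (antisense is checked before intronic and sense overlapping) are all assumptions made for the example. Coordinates are assumed to be 1-based, inclusive, and on the same chromosome.

```python
from dataclasses import dataclass, field


@dataclass
class Gene:
    """Hypothetical protein-coding gene model (not from the paper)."""
    start: int
    end: int
    strand: str                                  # "+" or "-"
    exons: list = field(default_factory=list)    # list of (start, end) tuples


def overlaps(a_start, a_end, b_start, b_end):
    """True if two closed intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def classify_lncrna(start, end, strand, genes):
    """Assign one of four illustrative labels to a lncRNA locus.

    Assumed rules (a simplification for this sketch):
      - no overlap with any gene            -> "intergenic"
      - overlaps a gene on the other strand -> "antisense"
      - same strand, touches an exon        -> "sense_overlapping"
      - same strand, touches no exon        -> "intronic"
    """
    hits = [g for g in genes if overlaps(start, end, g.start, g.end)]
    if not hits:
        return "intergenic"
    if any(g.strand != strand for g in hits):
        return "antisense"
    exonic = any(
        overlaps(start, end, ex_start, ex_end)
        for g in hits
        for ex_start, ex_end in g.exons
    )
    return "sense_overlapping" if exonic else "intronic"


# Toy annotation: one two-exon gene on the plus strand.
genes = [Gene(100, 500, "+", [(100, 200), (400, 500)])]
labels = [
    classify_lncrna(600, 700, "+", genes),
    classify_lncrna(150, 450, "-", genes),
    classify_lncrna(250, 350, "+", genes),
    classify_lncrna(150, 250, "+", genes),
]
print(labels)
```

Real annotation tools typically apply finer distinctions (for example, distance thresholds to the nearest gene, or strand-independent intronic calls), so the rule order here should be read only as one plausible simplification.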