Exploring repeats in rice genomes: Identification, Characterization and its Applications

Gourab Das, Indira Ghosh
DOI: https://doi.org/10.1101/2022.01.24.477639
2022-01-25
Abstract: Biodiversity is a fundamental property of all natural systems studied in biology. It refers to the underlying heterogeneity at different levels of ecology, genetics and evolution. In plant systems, dramatic variability has been observed during the Anthropocene at different spatial scales. Environmental stress is one of the major factors behind this plant biodiversity. Considerable genetic diversity has also been demonstrated across varieties of important crop species such as rice. Repetitive sequences, a major contributor to genomic diversity in polyploid plants, occur ubiquitously in their genomes. To date, diverse repeat types have been characterized in plant genomes, with roles ranging from qualitative trait markers to genome evolution and stress management. A robust method has been developed to identify plant stress-associated genes using DNA repeat probes. The method is organized into three modules. The first module identifies different types of repeats. An earlier review suggested building a pipeline of multiple tools to capture different types of repeats from genome sequences. Specialized tools such as TROLL, Tandem Repeats Finder (TRF) and PHOBOS, and databases such as REPBASE, were selected for this task. The second module screens stress-related genes from published articles and databases, and the third module mines associations between genomic repetitive patterns and stress phenotypes. The method has been used to explore stress-associated repeats in 9 Oryza species from different continents and in other plants such as Arabidopsis and Brachypodium. In Oryza species, the distribution of repeats was found to differ significantly between stress-associated and housekeeping genes.
More than 55% of the repeats are in positive association (nPMI > 0), whereas 26% of the repeats are false positives. These repeat probes have been used in several applications, for example as molecular markers to identify stress-related genes in different Oryza species where availability is limited, and as probes to reanalyze the evolutionary lineage of Oryza species.
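The nPMI > 0 criterion can be illustrated with a minimal sketch. It assumes that nPMI denotes normalized pointwise mutual information in its common form, PMI(x, y) / -log p(x, y), which the abstract does not spell out. The function name, the count-based probability estimates and the example counts are hypothetical and are not taken from the paper.

```python
import math


def npmi(n_both, n_repeat, n_stress, n_total):
    """Normalized PMI between 'gene carries the repeat' and 'gene is stress associated'.

    All arguments are gene counts (hypothetical inputs):
      n_both   - genes that carry the repeat and are stress associated
      n_repeat - genes that carry the repeat
      n_stress - genes that are stress associated
      n_total  - all genes considered
    The result lies in [-1, 1]; a value above 0 means the two events
    co-occur more often than expected under independence.
    """
    if n_both == 0:
        return -1.0  # limiting value when the events never co-occur
    p_xy = n_both / n_total
    if p_xy == 1.0:
        return 1.0  # every gene has both properties; avoids dividing by log(1) = 0
    p_x = n_repeat / n_total
    p_y = n_stress / n_total
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)


# Hypothetical counts: 10 of 100 genes carry the repeat and are stress associated,
# while 20 genes carry the repeat and 20 genes are stress associated overall.
score = npmi(n_both=10, n_repeat=20, n_stress=20, n_total=100)
print("positive association" if score > 0 else "no positive association")
```

Under this definition, a repeat counts as positively associated with stress genes when it occurs in them more often than its overall frequency would predict.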