Comprehensive analysis of simple sequence repeats in pre-miRNAs.
Ming Chen, Zhongyang Tan, Guangming Zeng, Jun Peng
DOI: https://doi.org/10.1093/molbev/msq100
IF: 10.7
2010-01-01
Molecular Biology and Evolution
Abstract: Simple sequence repeats (SSRs) are tandem repeats of 1-6 bp units that have been identified in a wide range of complete sequences. However, the distribution, nature, and origin of SSRs in pre-miRNAs are still not well studied. Pre-miRNAs are characteristic stem-loop sequences that are processed into functional miRNAs of ~22 nt, which contribute to the regulation of several biological processes. The availability of large numbers of pre-miRNAs makes it possible to analyze and compare the occurrence of SSRs, the relative count of SSRs, and the longest SSRs in pre-miRNAs. In this study, we analyzed SSRs in 8,619 pre-miRNAs from 87 species, including Arthropoda, Nematoda, Platyhelminthes, Urochordata, Vertebrata, Mycetozoa, Protistae, Viridiplantae, and Viruses. We find that SSRs are widespread in the pre-miRNAs analyzed. Mononucleotide repeats are the most abundant, followed by dinucleotide repeats, whereas tri-, tetra-, penta-, and hexanucleotide repeats rarely occur in pre-miRNAs. The average number of SSRs per pre-miRNA ranges from 4.1 for viruses to 13.5 for Mycetozoa. Our results confirm that the number of repeats correlates inversely with the length of the repeats. Generally, in each taxonomic group, the occurrence and relative count of SSRs decrease as the repeat unit length increases. SSRs do not exhibit an obvious preference for specific locations in pre-miRNAs. The repeats in pre-miRNAs are complementary to repeats in coding or noncoding regions of genomes, and no significant difference is observed between these two classes with respect to the occurrence of repeats. These data on SSRs may become a useful resource for the study of pre-miRNAs, and their possible functions are discussed.
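As an illustration of the SSR definition used above (tandem repeats of 1-6 bp units), the following is a minimal sketch of how such repeats might be located in a single pre-miRNA sequence and tallied by unit length. It is not the authors' pipeline. The abstract does not state the minimum number of copies required for a repeat to count, so the thresholds in `MIN_COPIES` are assumptions chosen for the example, and the names `find_ssrs`, `is_primitive`, and `count_by_unit_length` are hypothetical.

```python
from collections import Counter

# ASSUMPTION: minimum tandem copies per unit length. The abstract does not
# give these thresholds; the values here are illustrative only.
MIN_COPIES = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3}


def is_primitive(unit):
    """Return True if `unit` is not itself a repetition of a shorter motif."""
    n = len(unit)
    return not any(
        n % k == 0 and unit == unit[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq, min_copies=MIN_COPIES):
    """Return a sorted list of (start, unit, copies) for each SSR in `seq`."""
    seq = seq.upper()
    n = len(seq)
    hits = []
    for k in sorted(min_copies):
        i = 0
        while i + k <= n:
            unit = seq[i:i + k]
            # Skip units such as "AA" or "AUAU"; they are reported under
            # their shorter motif instead.
            if not is_primitive(unit):
                i += 1
                continue
            # Count consecutive copies of the unit starting at position i.
            copies = 1
            while seq[i + copies * k:i + (copies + 1) * k] == unit:
                copies += 1
            if copies >= min_copies[k]:
                hits.append((i, unit, copies))
                i += copies * k  # continue scanning after this repeat
            else:
                i += 1
    return sorted(hits)


def count_by_unit_length(hits):
    """Tally SSRs by repeat unit length (1 = mono-, 2 = dinucleotide, ...)."""
    return Counter(len(unit) for _, unit, _ in hits)
```

For example, `find_ssrs("AAAAAAUAUAU")` reports a mononucleotide run of six A's at position 0 and a dinucleotide (AU)3 repeat at position 5. Dividing the total number of hits by the number of sequences scanned would give a per-pre-miRNA average of the kind quoted in the abstract, though the resulting values depend strongly on the assumed thresholds.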