Systematic Analysis of Intron Size and Abundance Parameters in Diverse Lineages.
Wu JiaYan, Xiao JingFa, Wang LingPing, Zhong Jun, Yin HongYan, Wu ShuangXiu, Zhang, Yu Jun
DOI: https://doi.org/10.1007/s11427-013-4540-y
2013-01-01
Abstract: All eukaryotic genomes have genes with introns of variable sizes. For spliceosomal introns, at least three basic parameters can be used to stratify introns across diverse eukaryotic taxa: size, number, and sequence context. The number parameter is highly variable in lower eukaryotes, especially among protozoan and fungal species, where it ranges from less than 4% to 78% of the genes. Over greater evolutionary time scales, the number parameter undoubtedly increases, as observed in higher plants and higher vertebrates, reaching more than 12.5 exons per gene on average among mammalian genomes. The size parameter is more complex, with multiple modes apparently at work. Aside from intronless genes, intron-containing genes fall into three types according to their introns: half-sized, minimal, and size-expandable. Half-sized introns have been found only in a limited number of genomes among protozoan and fungal lineages, whereas the other two types are prevalent in all animal and plant genomes. Among the size-expandable introns, plant introns are expansion-limited: large introns exceeding 1000 bp are fewer in number and transposon-free. In animals, by contrast, the larger introns are filled with transposable elements and appear expansion-flexible, reaching several kilobase pairs (kbp) and even thousands of kbp in size. Most of the intron parameters can be studied as signatures of the specific splicing machineries of different eukaryotic lineages and are highly relevant to the regulation of gene expression and functionality. In particular, the transcription-splicing-export coupling of eukaryotic intron dispensing leads to the working hypothesis that all intron parameters have evolved to be efficient and function-related in processing and routing the spliced transcripts.
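
As an illustration of the number and size parameters named in the abstract, the following minimal Python sketch computes them from exon coordinates. It is not the authors' method. The function names, the toy gene set, and the coordinate convention (1-based, inclusive) are all assumptions made for this example, as is reading the number parameter as the fraction of genes that contain introns. Only the 1000 bp cutoff for large introns comes from the abstract.

```python
from statistics import mean

# Cutoff for "large" introns, taken from the abstract (introns exceeding 1000 bp).
LARGE_INTRON_BP = 1000


def intron_sizes(exons):
    """Return intron lengths for one gene.

    `exons` is a list of (start, end) pairs, assumed 1-based and inclusive,
    all on the same transcript. An intron is the gap between adjacent exons.
    """
    ordered = sorted(exons)
    return [
        next_start - prev_end - 1
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


def summarize(genes):
    """Summarize number and size parameters for a dict of gene -> exon list."""
    sizes = [s for exons in genes.values() for s in intron_sizes(exons)]
    with_introns = sum(1 for exons in genes.values() if len(exons) > 1)
    return {
        # Assumed reading of the number parameter: share of genes with introns.
        "intron_containing_fraction": with_introns / len(genes),
        "mean_exons_per_gene": mean(len(exons) for exons in genes.values()),
        # Share of introns longer than the 1000 bp cutoff.
        "large_intron_fraction": (
            sum(s > LARGE_INTRON_BP for s in sizes) / len(sizes) if sizes else 0.0
        ),
    }


# Hypothetical toy data, not taken from any real genome annotation.
toy_genes = {
    "geneA": [(1, 100), (201, 300), (1501, 1600)],  # introns of 100 and 1200 bp
    "geneB": [(1, 500)],                            # intronless
    "geneC": [(10, 50), (111, 200)],                # one intron of 60 bp
    "geneD": [(1, 200)],                            # intronless
}

summary = summarize(toy_genes)
print(summary)
```

On real data the exon coordinates would come from a genome annotation, and the same summary could be computed per lineage to compare, for example, plant and animal genomes.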