Dynamic Gene Copy Number Variation in Collinear Regions of Grass Genomes.

Jian-Hong Xu, Jeffrey L. Bennetzen, Joachim Messing
DOI: https://doi.org/10.1093/molbev/msr261
IF: 10.7
2011-01-01
Molecular Biology and Evolution
Abstract: A salient feature of genomes of higher organisms is the birth and death of gene copies. An example is the alpha prolamin genes, which encode seed storage proteins in grasses (Poaceae) and represent a medium-size gene family. To better understand the mechanism, extent, and pace of gene amplification, we compared prolamin gene copies in the genomes of two different tribes in the Panicoideae, the Paniceae and the Andropogoneae. We identified alpha prolamin (setarin) gene copies in the diploid foxtail millet (Paniceae) genome (490 Mb) and compared them with orthologous regions in diploid sorghum (730 Mb) and ancient allotetraploid maize (2,300 Mb) (Andropogoneae). Because sequenced genomes of other subfamilies of Poaceae like rice (389 Mb) (Ehrhartoideae) and Brachypodium (272 Mb) (Pooideae) do not have alpha prolamin genes, their collinear regions can serve as "empty" reference sites. A pattern emerged, where genes were copied and inserted into other chromosomal locations followed by additional tandem duplications (clusters). We observed both recent (species-specific) insertion events and older ones that are shared by these tribes. Many older copies were deleted by unequal crossing over of flanking sequences or damaged by truncations. However, some remain intact with active and inactive alleles. These results indicate that genomes reflect only a snapshot of the gene content of a species and are far less static than conventional genetics has suggested. Nucleotide substitution rates for active alpha prolamin genes were twice as high as for low copy number beta, gamma, and delta prolamin genes, suggesting that gene amplification accelerates the pace of divergence.