Densities, length proportions, and other distributional features of repetitive sequences in the human genome estimated from 430 megabases of genomic sequence

Z Gu, H Wang, A Nekrutenko, W H Li
DOI: https://doi.org/10.1016/S0378-1119(00)00434-0
IF: 3.913
2000-01-01
Gene
Abstract: The densities of repetitive elements in the human genome were calculated in each GC content class using non-overlapping windows of 50 kb. The density of Alu is two to three times higher in GC-rich regions than in AT-rich regions, while the opposite is true for LINE1. In contrast, LINE2 and other elements, such as DNA transposons, are more uniformly distributed in the genome. The number of Alus in the human genome was estimated to be 1.4 million, higher than previous estimates. About 40% of the autosomes and about 51% of the X and Y chromosomes are occupied by repetitive elements. In total, the human genome is estimated to contain more than 4 million repetitive elements. The GC contents (%) of repetitive elements and their flanking regions were also calculated. The GC contents of almost all kinds of repeats are positively correlated with the window GC contents, suggesting that a repetitive sequence is subject to the same mutation pressure as its surrounding regions and therefore tends toward the same GC content. This observation supports the regional mutation hypothesis. The only two exceptions are AluYa and AluYb8, the two youngest Alu subfamilies. The GC content of AluYb8 is negatively correlated with that of its surrounding regions, while AluYa shows no correlation, suggesting different insertion patterns for these two young Alu subfamilies. This suggestion is supported by the fact that the average genetic distance between members of AluYb8 in each GC window class is positively correlated with the GC content of the window, whereas no such correlation was found for AluYa. AluYa is more frequent on the Y chromosome than on other chromosomes; the same is true for LTR retroviruses. This pattern might be related to the evolutionary history of the Y chromosome.
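The windowing procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `repeat_density_by_gc_class`, the 5% class width, the choice to assign each repeat to the window containing its start position, and the decision to drop a trailing partial window are all assumptions made for illustration; the abstract specifies only the non-overlapping 50 kb windows and the grouping by GC content class.

```python
from collections import defaultdict


def repeat_density_by_gc_class(seq, repeat_starts, window=50_000, class_width=5):
    """Mean number of repeats per window, grouped by window GC class.

    seq           -- genomic sequence as a string (ambiguous bases are ignored)
    repeat_starts -- 0-based start positions of annotated repeats
                     (hypothetical input, e.g. parsed from a repeat annotation)
    window        -- window size in bases; 50 kb as in the abstract
    class_width   -- GC class width in percent (assumed, not from the abstract)

    Returns {lower bound of GC class in percent: mean repeats per window}.
    """
    seq = seq.upper()
    per_class = defaultdict(list)

    # Step through non-overlapping windows; a trailing partial window is dropped.
    for w_start in range(0, len(seq) - window + 1, window):
        w_end = w_start + window
        chunk = seq[w_start:w_end]

        gc = chunk.count("G") + chunk.count("C")
        acgt = gc + chunk.count("A") + chunk.count("T")
        if acgt == 0:
            continue  # window contains no unambiguous bases

        # Integer arithmetic avoids floating-point edge cases at class boundaries.
        gc_class = (100 * gc // acgt) // class_width * class_width

        # Assumption: a repeat belongs to the window containing its start.
        n_repeats = sum(1 for s in repeat_starts if w_start <= s < w_end)
        per_class[gc_class].append(n_repeats)

    return {c: sum(v) / len(v) for c, v in sorted(per_class.items())}
```

Called on a toy sequence with `window=10`, the function returns one mean count per GC class. A comparison such as "two to three times higher in GC-rich regions" then corresponds to the ratio between the values for high and low GC classes.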