Correlation Between GC-content and Palindromes in Randomly Generated Sequences and Viral Genomes

Andrew Ninh
DOI: https://doi.org/10.48550/arXiv.1302.5869
IF: 4.31
2013-02-24
Genomics
Abstract: GC-content, the proportion of guanine and cytosine bases in an entire nucleotide sequence, and palindromic sequences are unique to every organism due to genomic evolution. The goal of our research was to establish a correlation between GC-content and palindromic densities in wild-type viral and randomly generated genomes. Forty viral genomes were downloaded from GenBank, and their GC-ratios and palindromic densities were calculated and plotted using Mathematica. The plot of palindromic density against GC-ratio for randomly generated sequences (the palindromic density curve) exhibited a quadratic relationship and was superimposed over the viral genome plot. The viral data points followed the curvature of the random sequences' quadratic curve, signifying a directly proportional relationship between GC-content and palindrome density in viral genomes. However, because viral genomes require certain non-palindromic sequences to function, the palindromic densities of most wild-type genomes fell below the palindromic density curve. The variance in the palindrome densities of wild-type genomes with respect to the random sequences' quadratic curve may be examined to determine evolutionary traits in genomes. A better understanding of viral palindromic densities and GC-ratios would help in understanding conserved secondary RNA structures in viral genomes and could inform future drug discovery. In addition, certain viral genomes were found to be viable recombinant viruses, which are used in gene therapy.
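The two quantities compared in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact definition of palindromic density or the palindrome lengths considered, and the original analysis was done in Mathematica, so everything below is an assumption made for illustration: a palindrome is taken to be a word equal to its own reverse complement, the word length `k` is fixed (default 6), density is the fraction of length-`k` windows that are palindromic, and random sequences draw each base independently. The function names are hypothetical.

```python
import random

# Watson-Crick complement table for uppercase DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_ratio(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def is_revcomp_palindrome(word):
    """True if an uppercase word equals its own reverse complement.

    Assumption: "palindrome" means reverse-complement palindrome
    (e.g. GAATTC). Words with ambiguity codes such as N are rejected.
    """
    if not set(word) <= set("ACGT"):
        return False
    return word == word.translate(COMPLEMENT)[::-1]


def palindrome_density(seq, k=6):
    """Fraction of length-k windows in seq that are palindromic.

    Assumption: a fixed window length k and a per-window density;
    the paper's own definition may differ.
    """
    seq = seq.upper()
    windows = len(seq) - k + 1
    if windows <= 0:
        return 0.0
    hits = sum(is_revcomp_palindrome(seq[i:i + k]) for i in range(windows))
    return hits / windows


def random_sequence(length, gc, rng):
    """Random DNA of the given length; each base is G or C with probability gc."""
    at = (1.0 - gc) / 2.0
    weights = [at, gc / 2.0, gc / 2.0, at]  # order matches "ACGT"
    return "".join(rng.choices("ACGT", weights=weights, k=length))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sweep is repeatable
    for gc in (0.2, 0.35, 0.5, 0.65, 0.8):
        seq = random_sequence(50_000, gc, rng)
        print(f"target GC={gc:.2f}  "
              f"observed GC={gc_ratio(seq):.3f}  "
              f"density={palindrome_density(seq):.5f}")
```

Under this sketch's independent-base model, the probability that one mirrored pair of positions is complementary is (p² + (1 − p)²)/2 for GC fraction p, which is quadratic in p. Running the same two functions on a genome sequence and on random sequences of matching GC fraction gives the kind of comparison the abstract describes, subject to the assumptions above.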