Messenger RNA Information: Its Implication in Protein Structure Determination and Others

Liaofu Luo, Mengwen Jia
DOI: https://doi.org/10.1007/978-1-84628-780-0_14
2007-01-01
Abstract: Three problems on mRNA information in protein-coding regions are discussed: first, how the mRNA sequence information (tRNA gene copy number) is related to protein secondary structure; second, how the mRNA structure information (stem/loop content) is related to protein secondary structure; third, how the specific selection for mRNA folding energy is made among genomes. From statistical analyses of protein sequences for humans and E. coli we have found that m-codon segments (for m = 2 to 6) with a high average tRNA copy number (TCN) (larger than 10.5 for humans or 1.95 for E. coli) preferentially code for the alpha helix, and those with a low average TCN (smaller than 7.5 for humans or 1.7 for E. coli) preferentially code for the coil. Between them there is an intermediate region without structure preference. We have also demonstrated that helices and strands in proteins tend to be "coded" by mRNA stem regions, while coils in proteins tend to be "coded" by mRNA loop regions. The occurrence frequencies of stems in helix and strand fragments exceed the expected values by 6 standard deviations. The relation between mRNA stem/loop content and protein structure can also be seen from the point of view of mRNA folding energy. For both E. coli and humans, the mRNA folding energy in regions of regular protein structure is statistically lower than that of randomized sequences, but for irregular structure (coil) the Z scores are near their control values. We have also studied the folding energy of native mRNA sequences in 28 genomes from a broader perspective. Using analysis of covariance, with G+C content or base correlation as the covariate, we demonstrate that the intraspecific difference in mRNA folding free energy is much smaller than the interspecific difference. The distinction between intraspecific homogeneity and interspecific inhomogeneity is extremely significant (p < .0001).
This means that the selection for local mRNA structure is genome-specific. The high intraspecific homogeneity of mRNA folding energy, as compared with its large interspecific inhomogeneity, can be explained by concerted evolution. The above result also holds for the folding energy of native mRNA relative to randomized sequences. This shows that the distinction between intraspecific homogeneity and interspecific inhomogeneity of mRNA folding is robust under perturbation of sequence and structure.
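As an illustration of the two statistics named in the abstract, the following is a minimal Python sketch, not the authors' procedure. It assumes that "m-codon segments" can be read as sliding windows of m consecutive codons, and that the Z score takes the usual form (native value minus the mean of the randomized controls, divided by their standard deviation). The function names `classify_segments` and `z_score` and all input values are hypothetical; only the human thresholds 10.5 and 7.5 come from the abstract.

```python
from statistics import mean, stdev

# Thresholds quoted in the abstract for humans (E. coli: 1.95 and 1.7).
HUMAN_HIGH, HUMAN_LOW = 10.5, 7.5


def classify_segments(tcn_per_codon, m, high=HUMAN_HIGH, low=HUMAN_LOW):
    """Label each window of m consecutive codons by its average TCN.

    Assumption: segments are overlapping sliding windows; the abstract
    does not say how segments are delimited.
    """
    labels = []
    for i in range(len(tcn_per_codon) - m + 1):
        avg = mean(tcn_per_codon[i:i + m])
        if avg > high:
            labels.append("helix-preferring")
        elif avg < low:
            labels.append("coil-preferring")
        else:
            labels.append("no preference")  # intermediate region
    return labels


def z_score(native_energy, randomized_energies):
    """Z score of a native folding energy against randomized controls.

    Assumption: the standard definition with the sample standard
    deviation; the energies must be computed elsewhere by a folding tool.
    """
    return (native_energy - mean(randomized_energies)) / stdev(randomized_energies)


# Hypothetical per-codon TCN values, for illustration only.
example_labels = classify_segments([12, 11, 3, 4, 9, 9], m=2)
# Hypothetical folding energies (kcal/mol), for illustration only.
example_z = z_score(-10.0, [-8.0, -6.0, -7.0])
```

A negative Z score here means the native sequence folds more stably than its randomized controls, which is the direction the abstract reports for regions of regular protein structure.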