The Statistical Analysis on the Bases in Coding Start Region of E. coli Genes

Liaofu Luo
1998-01-01
Abstract: The 1,288 beginning sequences of gene coding regions of Escherichia coli are compiled, and the probabilities of the bases at sites 1 to 33 are calculated. We find that the base probability distributions at each site of the second and third codons differ from those at the other sites. Beginning with the fourth codon, the probabilities of bases G and T at the codon sites show a distinct period-3 distribution. The relation between the base probability distributions at these sites and the gene expression level is also studied. The results show that the base probability distributions in codons 2 and 3 differ between highly and lowly expressed genes, and that the period-3 distribution of bases G and T is more distinct in highly expressed genes than in lowly expressed genes.
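The per-site calculation described in the abstract can be illustrated with a minimal Python sketch. It is not the author's code: the function names `site_probabilities` and `codon_position_means` are hypothetical, the two input sequences are made-up toy data rather than E. coli genes, and averaging by codon position is only one simple way to expose a period-3 pattern, assumed here for illustration.

```python
from collections import Counter

BASES = "ACGT"


def site_probabilities(sequences, n_sites=33):
    """Return one dict per site with the probability of each base at that site."""
    probs = []
    for i in range(n_sites):
        # Count bases at site i, skipping short sequences and ambiguous symbols.
        counts = Counter(
            seq[i] for seq in sequences if len(seq) > i and seq[i] in BASES
        )
        total = sum(counts.values())
        probs.append({b: (counts[b] / total if total else 0.0) for b in BASES})
    return probs


def codon_position_means(probs, base, first_codon=4):
    """Average probability of `base` at codon positions 1, 2 and 3,
    starting from codon number `first_codon` (1-based)."""
    start = 3 * (first_codon - 1)
    means = []
    for offset in range(3):
        values = [probs[i][base] for i in range(start + offset, len(probs), 3)]
        means.append(sum(values) / len(values) if values else 0.0)
    return means


# Toy sequences invented for this example (33 bases each), not data from the paper.
toy = [
    "ATGAAAGCTGCTGCTGCTGCTGCTGCTGCTGCT",
    "ATGTCTGGTGGTGGTGGTGGTGGTGGTGGTGGT",
]
probs = site_probabilities(toy)
g_means = codon_position_means(probs, "G")
print(g_means)  # mean probability of G at codon positions 1, 2, 3
```

A clear difference among the three averages for a base would correspond to the period-3 distribution reported for G and T. Running the same functions separately on sets of highly and lowly expressed genes would allow the kind of comparison the abstract describes.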