Different Patterns of Codon Usage and Amino Acid Composition across Primate Lentiviruses

Angelo Pavesi, Fabio Romerio
DOI: https://doi.org/10.3390/v15071580
2023-07-20
Viruses
Abstract: A common feature of the mammalian Lentiviruses (family Retroviridae) is an RNA genome that contains an extremely high frequency of adenine (31.7–38.2%) while being extremely poor in cytosine (13.9–21.2%). Such a biased nucleotide composition has implications for codon usage, causing a striking difference between the frequency of synonymous codons in Lentiviruses and that in their hosts. To test whether primate Lentiviruses present differences in codon and amino acid composition, we assembled a dataset of genome sequences that includes SIV species infecting Old-World monkeys and African apes, HIV-2, and the four groups of HIV-1. Using principal component analysis, we found that HIV-1 shows a significant enrichment in adenine plus thymine in the third synonymous codon position and in adenine and guanine in the first and second nonsynonymous codon positions. Similarly, we observed an enrichment in adenine and in guanine in nonsynonymous first and second codon positions, which affects the amino acid composition of the proteins Gag, Pol, Vif, Vpr, Tat, Rev, Env, and Nef. This result suggests an effect of natural selection in shaping codon usage. Under the hypothesis that the use of synonyms in HIV-1 could reflect adaptation to that of genes expressed in specific cell types, we found a highly significant correlation between codon usage in HIV-1 and monocytes, which was remarkably higher than that with B and T lymphocytes. This finding is in line with the notion that monocytes represent an HIV-1 reservoir in infected patients, and it could help understand how this reservoir is established and maintained.
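The quantities mentioned in the abstract (base composition at each codon position, codon usage, and a correlation between two codon-usage profiles) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: it does not perform the principal component analysis, the function names are hypothetical, the two sequences are made-up toy strings rather than real viral or host genes, and the choice of a Pearson coefficient is an assumption, since the abstract does not state which correlation measure was used.

```python
from collections import Counter
from itertools import product
from math import sqrt

BASES = "ACGT"
# All 64 codons in a fixed order, so that usage vectors are comparable.
CODONS = ["".join(p) for p in product(BASES, repeat=3)]


def split_codons(seq):
    """Split a coding sequence into complete, unambiguous codons."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA notation
    usable = len(seq) - len(seq) % 3     # drop a trailing partial codon
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return [c for c in codons if set(c) <= set(BASES)]


def position_composition(seq):
    """Return base frequencies at codon positions 1, 2 and 3."""
    codons = split_codons(seq)
    result = {}
    for pos in range(3):
        counts = Counter(c[pos] for c in codons)
        total = sum(counts.values())
        result[pos + 1] = {
            b: (counts[b] / total if total else 0.0) for b in BASES
        }
    return result


def codon_frequencies(seq):
    """Return the relative frequency of each of the 64 codons."""
    counts = Counter(split_codons(seq))
    total = sum(counts.values())
    return [(counts[c] / total if total else 0.0) for c in CODONS]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sx * sy)


# Toy sequences (assumptions for illustration only, not real genes).
toy_viral = "ATGAAAGAAATAAGAAAA"  # adenine-rich, as lentiviral genomes are
toy_host = "ATGAAGGAGATCCGCAAG"   # same amino acids, G/C-ending synonyms

third_position = position_composition(toy_viral)[3]
similarity = pearson(codon_frequencies(toy_viral), codon_frequencies(toy_host))
```

Here `third_position` holds the base frequencies at the third codon position of the toy viral sequence, and `similarity` is the correlation between the two 64-element codon-usage vectors. A real analysis along the lines of the abstract would replace the toy strings with concatenated coding sequences of a virus and of genes expressed in a given cell type.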
virology