Using structure prediction of negative sense RNA virus nucleoproteins to assess evolutionary relationships

Kimberly Sabsay, Aartjan J.W. te Velthuis
DOI: https://doi.org/10.1101/2024.02.16.580771
2024-05-22
Abstract: Negative sense RNA viruses (NSV) include some of the most detrimental human pathogens, including the influenza, Ebola and measles viruses. NSV genomes consist of one or multiple single-stranded RNA molecules that are encapsidated into one or more ribonucleoprotein (RNP) complexes. These RNPs consist of viral RNA, a viral RNA polymerase, and many copies of the viral nucleoprotein (NP). Current evolutionary relationships within the NSV phylum are based on alignment of conserved RNA-directed RNA polymerase (RdRp) domain amino acid sequences. However, the RdRp domain-based phylogeny does not address whether NP, the other core protein in the NSV genome, evolved along the same trajectory or whether several RdRp-NP pairs evolved through convergent evolution in the segmented and non-segmented NSV genome architectures. Addressing how NP and the RdRp domain evolved may help us better understand NSV diversity. Since NP sequences are too short to infer robust phylogenetic relationships, we here used experimentally-obtained and AlphaFold 2.0-predicted NP structures to probe whether evolutionary relationships can be estimated using NSV NP sequences. Following flexible structure alignments of modeled structures, we find that the structural homology of the NSV NPs reveals phylogenetic clusters that are consistent with RdRp-based clustering. In addition, we were able to assign viruses for which RdRp sequences are currently missing to phylogenetic clusters based on the available NP sequence. Both our RdRp-based and NP-based relationships deviate from the current NSV classification of the segmented , which cluster with the other segmented NSVs in our analysis. Overall, our results suggest that the NSV RdRp and NP genes largely evolved along similar trajectories and that even short pieces of genetic, protein-coding information can be used to infer evolutionary relationships, potentially making metagenomic analyses more valuable.
Microbiology
What problem does this paper attempt to address?
This paper asks whether the evolutionary relationships of negative-sense single-stranded RNA viruses (NSV) can be assessed by predicting the structures of their nucleoproteins (NP). Specifically, the research aims to answer the following questions:

1. **Does the nucleoprotein (NP) follow a similar evolutionary trajectory to the RNA-dependent RNA polymerase (RdRp)?**
   - The current understanding of NSV evolution is mainly based on amino acid sequence alignments of the conserved RdRp domain. However, this RdRp-based approach does not address the evolutionary history of NP. The researchers therefore analyze NP structures to test whether NP and RdRp evolved along the same trajectory.
2. **In NSV with different genome architectures (segmented and non-segmented), have NP and RdRp undergone convergent evolution?**
   - The researchers want to understand whether NP and RdRp in NSV with different genome architectures share an evolutionary history or instead acquired similar functional and structural characteristics independently, through convergent evolution.
3. **Can NP structure information be used to infer evolutionary relationships, especially in the absence of RdRp sequences?**
   - NP sequences are too short to support robust phylogenetic analyses. The researchers use experimentally obtained NP structures and NP structures predicted by AlphaFold 2.0 to explore whether evolutionary relationships between NSV can be inferred from structure information.

### Overview of research methods

To achieve the above goals, the researchers adopted the following methods:

- **Structure prediction**: Use AlphaFold 2.0 to predict the three-dimensional structures of NSV nucleoproteins.
- **Structure alignment**: Align the predicted NP structures with flexible structure alignment tools (such as FATCAT 2.0).
- **Phylogenetic analysis**: Compare the structure alignment results with traditional sequence-based phylogenetic analyses (such as multiple sequence alignment, MSA) to evaluate how well structure information captures evolutionary relationships.

### Main findings

1. **NP structure homology reveals phylogenetic clusters consistent with RdRp**:
   - The NP phylogenetic clusters obtained through structure alignment are highly consistent with the RdRp-based phylogenetic clusters, suggesting that NP and RdRp largely evolved along similar trajectories.
2. **NP structure information can fill gaps where RdRp sequences are missing**:
   - For some viruses lacking RdRp sequences, the researchers were able to assign them to specific phylogenetic clusters based on NP structures, providing an additional route for inferring evolutionary relationships.
3. **NP structure alignment provides more evolutionary information than sequence alignment**:
   - Even for short sequences, structure alignment can resolve evolutionary relationships in more detail than sequence alignment, which helps to better understand the diversity of NSV.

Overall, this study provides a new perspective on the evolutionary history of NSV and demonstrates the potential of structure information in phylogenetic analysis.
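
The workflow summarized above (pairwise structure alignment, grouping by structural similarity, then a check against RdRp-based groups) can be illustrated with a small Python sketch. Everything concrete in it is an assumption made for illustration. The names `NP_A` to `NP_E`, the distance values, and the `RDRP_GROUP` labels are placeholders, not data from the paper. The distances are assumed to be derived from a pairwise structure alignment score, for example one minus a normalized similarity. Average-linkage clustering and the Rand index are stand-ins chosen for simplicity, not necessarily the methods the authors used.

```python
from itertools import combinations


def average_linkage(names, dist, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [frozenset([n]) for n in names]

    def cluster_distance(a, b):
        # Average linkage: mean distance over all cross-cluster member pairs.
        pairs = [(x, y) for x in a for y in b]
        return sum(dist[frozenset((x, y))] for x, y in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


def rand_index(clusters, reference):
    """Fraction of item pairs on which the two groupings agree."""
    label = {name: i for i, c in enumerate(clusters) for name in c}
    agree = total = 0
    for x, y in combinations(sorted(label), 2):
        same_in_clusters = label[x] == label[y]
        same_in_reference = reference[x] == reference[y]
        agree += same_in_clusters == same_in_reference
        total += 1
    return agree / total


# Hypothetical pairwise structural distances (placeholders, not real data).
NAMES = ["NP_A", "NP_B", "NP_C", "NP_D", "NP_E"]
RAW = {
    ("NP_A", "NP_B"): 0.20, ("NP_A", "NP_C"): 0.25, ("NP_B", "NP_C"): 0.22,
    ("NP_D", "NP_E"): 0.15,
    ("NP_A", "NP_D"): 0.80, ("NP_A", "NP_E"): 0.85,
    ("NP_B", "NP_D"): 0.78, ("NP_B", "NP_E"): 0.82,
    ("NP_C", "NP_D"): 0.75, ("NP_C", "NP_E"): 0.79,
}
DIST = {frozenset(pair): value for pair, value in RAW.items()}

# Hypothetical RdRp-based group for each virus (placeholder labels).
RDRP_GROUP = {"NP_A": "g1", "NP_B": "g1", "NP_C": "g1",
              "NP_D": "g2", "NP_E": "g2"}

clusters = average_linkage(NAMES, DIST, n_clusters=2)
score = rand_index(clusters, RDRP_GROUP)

for cluster in clusters:
    print(sorted(cluster))
print("agreement with RdRp-based groups:", score)
```

In this toy setup, a virus with no RdRp sequence could still be placed by adding its NP distances to `DIST` and seeing which cluster it joins, which mirrors the assignment step described in the findings. A real analysis would need a principled choice of distance measure and tree-building method.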