Conserved Untranslated Regions of Multipartite Viruses: Natural Markers of Novel Viral Genomic Components and Tags of Viral Evolution
Song Zhang, Caixia Yang, Jiaxing Wu, Yuanjian Qiu, Zhiyou Xuan, Liu Yang, Ruiling Liao, Xiaofei Liang, Haodong Yu, Fang Ren, Yafeng Dong, Xiaoying Xie, Yanhong Han, Di Wu, Pedro Luis Ramos-González, Juliana Freitas-Astúa, Changyong Zhou, Mengji Cao
DOI: https://doi.org/10.1101/2022.01.16.476546
IF: 5.614
2022-01-01
Virus Evolution
Abstract: Viruses with split genomes are categorized as either segmented or multipartite, according to whether their genomic segments are packaged in a single virion or in different virions. The picture is complicated because viruses that inherit "core" vital segments may replace their other segments as host and environmental changes keep driving viral evolution. Despite this uncertainty, short or long stretches of conserved or identical sequences have been observed empirically in the untranslated regions (UTRs) across split genomes. In this study, we describe a methodology that combines RNA and small RNA sequencing, conventional BLASTx, and iterative BLASTn of UTRs to detect viral genomic components even if they encode orphan genes (ORFans). Within the phylum Kitrinoviricota, novel putative multipartite viruses and viral genomic components were annotated using data obtained from our own sampling or from publicly available sources. The novel viruses, as extensions or intermediate nodes, enriched the evolutionary networks. Furthermore, the diversity of the novel genomic components emphasized the evolutionary roles of reassortment and recombination, as well as genetic deletion, strongly supporting this genomic complexity. These data also suggest that current knowledge of these genomic components is insufficient for categorizing some extant viral taxa. The relative conservation of UTRs at the genome level may explain the relationships between monopartite and multipartite viruses, and how multipartite viruses can sustain a life strategy involving multiple host cells.
Author summary: Current workflows for virus identification are largely based on high-throughput sequencing coupled with analysis methods and tools that depend on protein sequence homology. For viruses with split genomes, however, these workflows are inadequate for identifying genomic components whose deduced protein sequences are not homologous to known sequences. At the same time, many plant-infecting multipartite viruses contain conserved UTRs across their genomic components. Based on this, we propose a practical method, UTR-backed iterative BLASTn (UTR-iBLASTn), to explore components carrying ORFans and to study virus evolution using the UTRs as signals. Together, the method and the UTR signals shed light on viral "dark matter", that is, unknown or omitted genomic components of segmented and multipartite viruses from different kingdoms and hosts, and on the origins of these components.
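The iterative idea behind a UTR-backed search can be illustrated with a small, self-contained sketch. It is not the authors' pipeline and does not call BLASTn: exact k-mer sharing stands in for a nucleotide hit, the first and last `utr_len` bases of each contig stand in for its UTRs, and only one strand is considered. The function names, the parameters `k` and `utr_len`, and the toy sequences are all hypothetical and chosen for illustration only.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def terminal_regions(contig, utr_len):
    """Return the 5' and 3' ends of a contig as crude stand-ins for its UTRs."""
    return [contig[:utr_len], contig[-utr_len:]]


def utr_iterative_search(seed_utrs, contigs, k=8, utr_len=12):
    """Map each recovered contig name to the round in which it was found.

    Assumption: sharing one exact k-mer with a known UTR counts as a hit.
    A real search would use BLASTn on both strands with tuned thresholds.
    """
    known = set()
    for utr in seed_utrs:
        known |= kmers(utr, k)

    remaining = dict(contigs)
    rounds = {}
    round_no = 0
    while True:
        round_no += 1
        # Collect hits against the UTR k-mers known at the start of this round.
        hits = [
            name for name, seq in remaining.items()
            if any(kmers(end, k) & known for end in terminal_regions(seq, utr_len))
        ]
        if not hits:
            break
        # Termini of new hits become queries for the next round.
        for name in hits:
            rounds[name] = round_no
            for end in terminal_regions(remaining.pop(name), utr_len):
                known |= kmers(end, k)
    return rounds


if __name__ == "__main__":
    # Toy contigs: A carries the seed UTR, B shares a terminus only with A,
    # and C shares nothing with either.
    toy_contigs = {
        "contig_A": "ACGTTGCAAGGC" + "G" * 10 + "TTGACCGTAGCA",
        "contig_B": "TTGACCGTAGCA" + "C" * 10 + "GATCGATTACAG",
        "contig_C": "A" * 12 + "G" * 10 + "T" * 12,
    }
    print(utr_iterative_search(["ACGTTGCAAGGC"], toy_contigs))
```

In this toy input, `contig_B` has no similarity to the seed UTR and is reached only through the terminus it shares with `contig_A`, which is the reason for iterating rather than searching once. No protein homology is consulted at any step, so a component encoding an ORFan can still be recovered.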
### Competing Interest Statement
The authors have declared no competing interest.