Distinct mutations and lineages of SARS‐CoV‐2 virus in the early phase of COVID‐19 pandemic and subsequent 1‐year global expansion

Yan Chen, Shiyong Li, Wei Wu, Shuaipeng Geng, Mao Mao
DOI: https://doi.org/10.1002/jmv.27580
IF: 20.693
2022-01-18
Journal of Medical Virology
Abstract: A novel coronavirus, SARS-CoV-2, has caused over 274 million cases and over 5.3 million deaths worldwide since it emerged in Wuhan, China, in December 2019. Here we characterize the temporospatial evolution and expansion dynamics of SARS-CoV-2 by taking a series of cross-sectional views of viral genomes: from the early outbreak in Wuhan in January 2020, through the early phase of global ignition in early April, to the subsequent global expansion by late December 2020. Based on phylogenetic analysis of the early patients in Wuhan, Wuhan/WH04/2020 appears to be a more appropriate reference genome for SARS-CoV-2 than the first sequenced genome, Wuhan-Hu-1. By scrutinizing cases from the very early outbreak, we found that a viral genotype from the Seafood Market in Wuhan, characterized by two concurrent mutations (the M type), had become the overwhelmingly dominant genotype of the pandemic (95.3%) 1 year later. By analyzing 4013 SARS-CoV-2 genomes collected from different continents by early April, we were able to examine how viral genomic composition changed during the initial phase of global ignition, over a time span of 14 weeks. We also identified eleven major viral genotypes with distinct geographic distributions. The WE1 type, a descendant of M observed predominantly in western Europe, accounted for half of all cases (50.2%) at the time. The mutations of major genotypes at the same hierarchical level were mutually exclusive, which implies that genotypes bearing these specific mutations were propagated through human-to-human transmission rather than arising from hot-spot mutations accumulated during the replication of individual viral genomes. As the pandemic unfolded, we applied the same approach to 261 323 SARS-CoV-2 genomes collected worldwide since the outbreak in Wuhan (i.e., all publicly available viral genomes) to test whether our findings held over a 1-year time span.
By December 25, 2020, 95.3% of global cases were M type, and 93.0% of M-type cases were WE1. In fact, all five current variants of concern (VOCs) are descendants of the WE1 type. This study demonstrates that viral genotypes can be used as molecular barcodes, in combination with epidemiologic data, to monitor the spreading routes of the pandemic and to evaluate the effectiveness of control measures. Moreover, the dynamics of the viral mutational spectrum described in the study may help identify new strains in patients early enough to reduce further spread of infection, guide the development of molecular diagnostics and vaccines against COVID-19, and help assess their accuracy and efficacy in the real world in real time.
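The barcode idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the mutation labels (`mutA` to `mutD`, `mutX`), the barcode table, and the function names are hypothetical placeholders, since the abstract does not list the defining mutations. The sketch assigns each genome to the most specific genotype whose defining mutations it carries, checks whether two mutations are mutually exclusive across genomes, and multiplies the two percentages reported above to get the implied global share of WE1.

```python
from collections import Counter

# Hypothetical barcodes: genotype -> set of defining mutations.
# The labels are placeholders, NOT the mutations reported in the paper.
GENOTYPE_BARCODES = {
    "M": {"mutA", "mutB"},            # two concurrent mutations
    "WE1": {"mutA", "mutB", "mutC"},  # descendant of M: M's mutations plus one more
}

def assign_genotype(mutations):
    """Return the most specific genotype whose barcode is contained in `mutations`."""
    carried = set(mutations)
    matches = [g for g, barcode in GENOTYPE_BARCODES.items() if barcode <= carried]
    if not matches:
        return "other"
    # The largest matching barcode is the deepest level of the hierarchy.
    return max(matches, key=lambda g: len(GENOTYPE_BARCODES[g]))

def genotype_shares(genomes):
    """Fraction of genomes assigned to each genotype."""
    counts = Counter(assign_genotype(g) for g in genomes)
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()}

def mutually_exclusive(genomes, mut_a, mut_b):
    """True if no genome carries both mutations."""
    return not any(mut_a in g and mut_b in g for g in genomes)

# Toy data (invented for illustration only).
toy_genomes = [
    {"mutA", "mutB", "mutC"},  # WE1
    {"mutA", "mutB", "mutC"},  # WE1
    {"mutA", "mutB", "mutD"},  # M, carrying a sibling-level mutation
    {"mutX"},                  # neither M nor WE1
]
shares = genotype_shares(toy_genomes)
siblings_exclusive = mutually_exclusive(toy_genomes, "mutC", "mutD")

# Figures from the abstract: 95.3% of global cases were M type,
# and 93.0% of M-type cases were WE1.
we1_global_share = 0.953 * 0.930  # about 0.886, i.e. roughly 88.6% of global cases
```

On the toy data, `mutC` and `mutD` never co-occur, which is the pattern the abstract describes for genotypes at the same hierarchical level. The last line shows that the two reported percentages together imply that WE1 made up roughly 88.6% of global cases by December 25, 2020.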
virology