Evolution and Variation of 2019-Novel Coronavirus

Chenglong Xiong, Lufang Jiang, Yue Chen, Qingwu Jiang
DOI: https://doi.org/10.1101/2020.01.30.926477
2020-01-01
Abstract:
Background: The current outbreak caused by the novel coronavirus (2019-nCoV) in China has become a worldwide concern. As of 28 January 2020, there were 4631 confirmed cases and 106 deaths, and 11 countries or regions were affected.
Methods: We downloaded the genomes of 2019-nCoVs and similar isolates from the Global Initiative on Sharing Avian Influenza Database (GISAID) and the nucleotide database of the National Center for Biotechnology Information (NCBI). Lasergene 7.0 and MEGA 6.0 were used to calculate genetic distances between the sequences, to construct phylogenetic trees, and to align amino acid sequences. Bayesian coalescent phylogenetic analysis, implemented in the BEAST software package, was used to estimate molecular clock characteristics such as the nucleotide substitution rate and the time to the most recent common ancestor (tMRCA) of 2019-nCoVs.
Results: One isolate, EPI_ISL_403928, differed from the other 2019-nCoVs in its phylogenetic tree placement and genetic distances for the whole-length genome and for the coding sequences (CDS) of the polyprotein (P), spike protein (S), and nucleoprotein (N). At the amino acid level, there are 22, 4, and 2 variations in P, S, and N, respectively. The nucleotide substitution rates, from high to low, are 1.05 × 10^-2 substitutions/site/year (95% HPD interval: 6.27 × 10^-4 to 2.72 × 10^-2) for N, 5.34 × 10^-3 (5.10 × 10^-4 to 1.28 × 10^-2) for S, 1.69 × 10^-3 (3.94 × 10^-4 to 3.60 × 10^-3) for P, and 1.65 × 10^-3 (4.47 × 10^-4 to 3.24 × 10^-3) for the whole genome. At these nucleotide substitution rates, the tMRCA of 2019-nCoVs was estimated at about 0.253-0.594 years before the epidemic.
Conclusion: Our analysis suggests that at least two different viral strains of 2019-nCoV are involved in this outbreak, which might have begun a few months before it was officially reported.
Abbreviations: CoVs, coronaviruses; 2019-nCoV, 2019-novel coronavirus; SARS-CoV, severe acute respiratory syndrome coronavirus; MERS-CoV, Middle East respiratory syndrome coronavirus; CDS, coding sequence; tMRCA, time to the most recent common ancestor; GISAID, the Global Initiative on Sharing Avian Influenza Database; ESSs, effective sample sizes.
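As a rough check on the scale of these figures, the short Python sketch below converts the reported tMRCA interval from years to days and months, and turns the whole-genome substitution rate into an approximate number of substitutions per genome per year. It is not part of the authors' analysis. The genome length of about 29,900 nucleotides and the 365.25-day year are assumptions made for illustration, and the function names are hypothetical.

```python
# Values quoted in the abstract.
WHOLE_GENOME_RATE = 1.65e-3   # substitutions/site/year (point estimate)
TMRCA_YEARS = (0.253, 0.594)  # years before the epidemic

# Assumptions for illustration only (not stated in the abstract).
DAYS_PER_YEAR = 365.25
ASSUMED_GENOME_LENGTH = 29_900  # approximate genome length in nucleotides


def years_to_days(years: float) -> float:
    """Convert a duration in years to days."""
    return years * DAYS_PER_YEAR


def years_to_months(years: float) -> float:
    """Convert a duration in years to months."""
    return years * 12.0


def expected_substitutions_per_year(rate_per_site: float, n_sites: int) -> float:
    """Expected substitutions per genome per year for a per-site rate."""
    return rate_per_site * n_sites


if __name__ == "__main__":
    low, high = TMRCA_YEARS
    print(f"tMRCA: {years_to_days(low):.0f} to {years_to_days(high):.0f} days")
    print(f"tMRCA: {years_to_months(low):.1f} to {years_to_months(high):.1f} months")
    subs = expected_substitutions_per_year(WHOLE_GENOME_RATE, ASSUMED_GENOME_LENGTH)
    print(f"Whole genome: about {subs:.0f} substitutions/genome/year")
```

Under these assumptions the interval works out to roughly 92 to 217 days, or about 3 to 7 months, which is consistent with the conclusion that the outbreak may have begun a few months before it was officially reported.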