Genomic representation predicts an asymptotic host adaptation of bat coronaviruses using deep learning

Jing Li, Fengjuan Tian, Sen Zhang, Shun-Shuai Liu, Xiao-Ping Kang, Ya-Dan Li, Jun-Qing Wei, Wei Lin, Zhongyi Lei, Ye Feng, Jia-Fu Jiang, Tao Jiang, Yigang Tong
DOI: https://doi.org/10.3389/fmicb.2023.1157608
IF: 5.2
2023-05-06
Frontiers in Microbiology
Abstract: Introduction: Coronaviruses (CoVs) are naturally found in bats and can occasionally cause infection and transmission in humans and other mammals. Our study aimed to build a deep learning (DL) method to predict the adaptation of bat CoVs to other mammals. Methods: The CoV genome was represented by the dinucleotide composition representation (DCR) of the two main viral genes, ORF1ab and Spike. DCR features were first analyzed for their distribution among adaptive hosts and then used to train a convolutional neural network (CNN) classifier to predict the adaptation of bat CoVs. Results and discussion: The results demonstrated inter-host separation and intra-host clustering of DCR-represented CoVs for six host types: Artiodactyla, Carnivora, Chiroptera, Primates, Rodentia/Lagomorpha, and Suiformes. The DCR-based CNN with five host labels (without Chiroptera) predicted a dominant adaptation of bat CoVs to Artiodactyla hosts, then to Carnivora and Rodentia/Lagomorpha mammals, and later to Primates. Moreover, a linear asymptotic adaptation of all CoVs (except Suiformes) from Artiodactyla to Carnivora and Rodentia/Lagomorpha and then to Primates indicates an asymptotic bat-to-other-mammal-to-human adaptation. Conclusion: Genomic dinucleotides represented as DCR indicate host-specific separation and clustering and, via deep learning, predict a linear asymptotic adaptation shift of bat CoVs from other mammals to humans.
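To make the dinucleotide representation concrete, the following is a minimal sketch of a dinucleotide composition vector for a single gene sequence. It assumes the representation is simply the normalized frequency of the 16 overlapping dinucleotides; the abstract does not give the exact DCR definition, so the paper's encoding may differ (for example, in how reading frames or the two genes are combined). The names `DINUCLEOTIDES` and `dinucleotide_composition` and the example sequence are hypothetical, chosen for illustration only.

```python
from itertools import product

# The 16 dinucleotides in a fixed order: AA, AC, AG, AT, CA, ..., TT.
DINUCLEOTIDES = ["".join(pair) for pair in product("ACGT", repeat=2)]


def dinucleotide_composition(seq):
    """Return 16 dinucleotide frequencies counted over overlapping windows.

    Assumption: plain normalized counts; not necessarily the paper's DCR.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA input as well
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip windows containing ambiguous bases such as N
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(DINUCLEOTIDES)
    return [counts[d] / total for d in DINUCLEOTIDES]


# Illustrative toy sequence, not taken from any real CoV genome.
features = dinucleotide_composition("ATGTTTGTTTTTCTTGTT")
print(dict(zip(DINUCLEOTIDES, features)))
```

Under this assumption, one such vector per gene (ORF1ab and Spike) would form the input that a classifier such as a CNN is trained on.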
microbiology