Identify phage hosts from metaviromic short reads based on deep learning and Markov chain model

Jie Tan, Zhencheng Fang, Shufang Wu, Qian Guo, Xiaoqing Jiang, Huaiqiu Zhu
DOI: https://doi.org/10.1101/2021.03.01.433351
IF: 5.8
2021-01-01
Bioinformatics
Abstract: Phages - viruses that infect bacteria and archaea - are dominant in the virosphere and play an important role in the microbial community. Identifying the host of a given phage fragment from metavirome data is important for understanding the ecological impact of phages in a microbial community. State-of-the-art tools for host identification only present reliable results on long sequences within a narrow candidate host range, while real metagenomic data contain a large number of short fragments and the taxonomic composition of a microbial community is often complicated. Here, we present a method, named HoPhage, to identify the host of a given phage fragment from metavirome data at the genus level. HoPhage integrates two modules, one based on deep learning algorithms and the other on a Markov chain model. Tested on both an artificial benchmark dataset of phage contigs and real virome data, HoPhage demonstrates satisfactory performance on short fragments within a wide candidate host range at every taxonomic level. HoPhage is freely available at <http://cqb.pku.edu.cn/ZhuLab/HoPhage/>.
Competing Interest Statement: The authors have declared no competing interest.