Uncovering 1,058 novel human enteric DNA viruses through deep long-read third-generation sequencing and their clinical impact

Liuyang Zhao, Yu Shi, Harry Cheuk-Hay Lau, Weixin Liu, Guangwen Luo, Guoping Wang, Changan Liu, Yasi Pan, Qiming Zhou, Yanqiang Ding, Joseph Jao-Yiu Sung, Jun Yu
DOI: https://doi.org/10.1053/j.gastro.2022.05.048
IF: 29.4
2022-06-08
Gastroenterology
Abstract: Background & Aims: The lack of viral reference genomes poses a challenge to virome studies. We investigated the human gut virome and its clinical implications by ultra-deep metagenomic sequencing. Methods: We extracted sufficient viral DNA from human faeces for ultra-deep PacBio sequencing (>10 μg) and Illumina sequencing (>1 μg). After de novo assembly and six stages of strict filtering, viral genomes were generated and validated in three cohorts of 2,819 published faecal metagenomes. The diagnostic performance of the assembled viruses for colorectal cancer (CRC) was tested in a training cohort and two independent validation cohorts. Virus mapping ratio, evolutionary history, and virus status (lytic/temperate) were also examined. Results: The mean amount of extracted viral DNA increased 14-fold compared with previous protocols. We obtained PacBio long reads and Illumina short reads at 290-fold higher depth than previous studies. We assembled and validated 1,178 contigs as complete viral genomes, of which 1,058 were newly identified. Thirteen viral genomes (398-839 kb) longer than the largest bacteriophage previously found in humans (393 kb) were discovered. A phylogenetic tree was constructed based on hidden Markov model alignment scores of 4 conserved viral proteins. Incorporating our assembled genomes into the NCBI database improved the mapping ratio of published metagenomes by up to 18-fold. Lytic viruses predominated in our sample (75.9% ± 12.2% of total). A biomarker panel of 14 novel viruses discriminated CRC patients from controls with an AUC of 0.87 in the training cohort, which was validated with AUCs of 0.85 and 0.73 in two independent cohorts. Conclusion: We uncovered 1,058 novel human gut viruses. These findings can contribute to clinical diagnosis, current viral reference genomes, and future virome investigations.
gastroenterology & hepatology
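
The AUC values quoted in the abstract (0.87, 0.85, 0.73) can be read as the probability that a randomly chosen CRC patient receives a higher panel score than a randomly chosen control. The following is a minimal sketch of that pairwise definition only, not the authors' classifier or pipeline; the function name `auc` and the example scores are hypothetical and are not data from the study.

```python
def auc(scores_pos, scores_neg):
    """Pairwise AUC: fraction of (case, control) pairs in which the case
    scores higher than the control; ties count as half a win."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical panel scores for illustration only (not from the paper).
crc_scores = [0.9, 0.8, 0.6, 0.4]
control_scores = [0.7, 0.3, 0.2, 0.1]

print(auc(crc_scores, control_scores))  # 14 of 16 pairs ranked correctly -> 0.875
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported cohort values should be read.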