Elimination of Foreign Sequences in Eukaryotic Viral Reference Genomes Improves the Accuracy of Virome Analysis

Junjie Chen, Yue Sun, Xiaomin Yan, Zilin Ren, Guoshuai Wang, Yuhang Liu, Zihan Zhao, Le Yi, Changchun Tu, Biao He
DOI: https://doi.org/10.1128/msystems.00907-22
2022-10-27
mSystems
Abstract: High-throughput sequencing-based viromics depends heavily on reference databases, but foreign contamination is widespread in public databases and often leads to confusing or even wrong conclusions in genomic analysis and viromic profiling. To address this issue, we systematically detected and identified contamination in the largest viral sequence collections of GenBank and UniProt using a stringent scrutiny pipeline.
microbiology