DeepCheck: multitask learning aids in assessing microbial genome quality

Guo Wei, Nannan Wu, Kunyang Zhao, Sihai Yang, Long Wang, Yan Liu
DOI: https://doi.org/10.1093/bib/bbae539
IF: 9.5
2024-10-23
Briefings in Bioinformatics
Abstract: Metagenomic analyses facilitate the exploration of the microbial world, advancing our understanding of microbial roles in ecological and biological processes. A pivotal aspect of metagenomic analysis involves assessing the quality of metagenome-assembled genomes (MAGs), crucial for accurate biological insights. Current machine learning–based methods often treat completeness and contamination prediction as separate tasks, overlooking their inherent relationship and limiting models' generalization. In this study, we present DeepCheck, a multitasking deep learning framework for simultaneous prediction of MAG completeness and contamination. DeepCheck consistently outperforms existing tools in accuracy across various experimental settings and demonstrates comparable speed while maintaining high predictive accuracy even for new lineages. Additionally, we employ interpretable machine learning techniques to identify specific genes and pathways that drive the model's predictions, enabling independent investigation and assessment of these biological elements for deeper insights.
biochemical research methods, mathematical & computational biology
What problem does this paper attempt to address?