Machine Learning-Aided Analyses of Thousands of Draft Genomes Reveal Specific Features of Activated Sludge Processes

Ye Lin, Mei Ran, Liu Wen-Tso, Ren Hongqiang, Zhang Xu-Xiang
DOI: https://doi.org/10.1186/s40168-020-0794-3
IF: 15.5
2020-01-01
Microbiome
Abstract: Microorganisms in activated sludge (AS) play key roles in the wastewater treatment process. However, the ecological behavior of microorganisms in AS and their differences from microorganisms in other environments have mainly been studied using the 16S rRNA gene, which may not truly represent their in situ functions. Here, we present 2045 bacterial and archaeal metagenome-assembled genomes (MAGs) recovered from 1.35 Tb of metagenomic sequencing data generated from 114 AS samples of 23 full-scale wastewater treatment plants (WWTPs). The average completeness and contamination of the MAGs are 82.0% and 2.0%, respectively. We find that the AS MAGs have clearly plant-specific features and that few proteins are shared by different WWTPs, especially WWTPs located in geographically distant areas. Despite these differences, specific functional traits of AS MAGs (e.g., functions related to aerobic metabolism, nutrient sensing/acquisition, and biofilm formation) could be identified by a machine learning approach, and based on these traits, AS MAGs could be differentiated from MAGs of other environments with an accuracy of 96.6%. Our work provides valuable genome resources for future investigation of the AS microbiome and also introduces a novel approach to understanding the microbial ecology of different ecosystems.