OTUCD: Unsupervised GCN based metagenomics non-overlapping community detection

Zhongqing Zhang, Qiqi Jiao, Yang Zhang, Bo Liu, Yadong Wang, Junyi Li
DOI: https://doi.org/10.1016/j.compbiolchem.2022.107670
Abstract: Metagenomics is the discipline that studies the genetic material of all microorganisms in a biological environment. In recent years, the interactions between metagenomic microbial communities, horizontal gene transfer, and the dynamic changes of microbial ecosystems have attracted increasing attention. Using community detection algorithms to divide metagenomic microbes into modules is of great significance, and it can guide follow-up research on human-drug-microbe interactions and on drug prediction and development. At present, mining the useful information hidden in large-scale microbial sequence data remains challenging: the non-linear characteristics and poor scalability of such data are still unresolved obstacles. This paper proposes OTUCD (Operational Taxonomic Unit Community Detection), an end-to-end unsupervised GCN learning model that divides large-scale metagenomic sequence data into potential gene modules. We construct an OTU network and then perform non-overlapping community detection on it with graph convolutional networks. Experimental results show that the community detection performance of this method is better than that of other recent metagenomic algorithms.
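The two-stage pipeline described in the abstract (build an OTU network, then run a GCN over it to obtain non-overlapping communities) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the network is built or how the model is trained, so the correlation threshold, the two-layer architecture, the random untrained weights, and every function name below are assumptions made purely for illustration. Training with an unsupervised objective is omitted.

```python
import numpy as np


def build_otu_network(abundance, threshold=0.5):
    """Hypothetical network construction: link two OTUs when the absolute
    Pearson correlation of their abundance profiles reaches `threshold`.
    `abundance` has shape (n_otus, n_samples); rows must not be constant,
    otherwise the correlation is undefined."""
    corr = np.corrcoef(abundance)
    adj = (np.abs(corr) >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops in the raw network
    return adj


def normalize_adjacency(adj):
    """Symmetric GCN normalization: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_soft_assignments(adj, features, w1, w2):
    """Two graph-convolution layers followed by a row-wise softmax.
    Returns an (n_otus, n_communities) matrix of soft memberships."""
    a_norm = normalize_adjacency(adj)
    hidden = np.maximum(a_norm @ features @ w1, 0.0)  # ReLU
    logits = a_norm @ hidden @ w2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


def modularity(adj, labels):
    """Newman modularity of a hard partition. An unsupervised model could
    optimize a relaxed version of this; the source does not name its loss."""
    two_m = adj.sum()
    if two_m == 0:
        return 0.0
    deg = adj.sum(axis=1)
    same = labels[:, None] == labels[None, :]
    return float(((adj - np.outer(deg, deg) / two_m) * same).sum() / two_m)


def detect_communities(abundance, n_communities=3, hidden_dim=8, seed=0):
    """Run the sketch end to end with random (untrained) weights."""
    rng = np.random.default_rng(seed)
    adj = build_otu_network(abundance)
    w1 = rng.normal(size=(abundance.shape[1], hidden_dim))
    w2 = rng.normal(size=(hidden_dim, n_communities))
    soft = gcn_soft_assignments(adj, abundance, w1, w2)
    # Taking the argmax gives each OTU exactly one community,
    # which is what makes the partition non-overlapping.
    return adj, soft, soft.argmax(axis=1)
```

Calling `detect_communities(abundance)` on an OTU-by-sample abundance matrix returns the adjacency matrix, the soft membership matrix, and one community label per OTU. With untrained weights the labels carry no biological meaning; the sketch only shows how the data flows through the two stages.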