MetaHMM: A webserver for identifying novel genes with specified functions in metagenomic samples

Balázs Szalkai, Vince Grolmusz
DOI: https://doi.org/10.1016/j.ygeno.2018.05.016
IF: 4.31
2019-07-01
Genomics
Abstract: Fast and affordable sequencing of large clinical and environmental metagenomic datasets opens up new horizons in medical and biotechnological applications. It is believed that only about 1% of the microorganisms on Earth have been described so far; metagenomic analysis therefore deals mostly with unknown species. Microbial communities in extreme environments may contain genes with high biotechnological potential, and clinical metagenomes related to diseases may reveal still-unknown pathogens and pathological mechanisms in known diseases. While species-level identification and description of the taxa in the samples does not seem to be possible today, we can search these samples for novel genes with known functions using numerous techniques, including artificial intelligence tools such as hidden Markov models (HMMs). Here we describe MetaHMM, a simple-to-use webserver that automatically builds a homology-based model for the genes to be searched for and finds the closest matches in the metagenome. The webserver relies on well-established building blocks: it computes multiple alignments with Clustal Omega, builds a hidden Markov model with the HMMER component hmmbuild, and uses hmmsearch to find sequences in the metagenomes that are similar to the specified model. The webserver is publicly available at https://metahmm.pitgroup.org.
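The three building blocks named in the abstract can be chained on the command line. The following is a minimal sketch of such a chain, not the webserver's actual implementation: the function names, file names, and working-directory layout are hypothetical, and it assumes that `clustalo`, `hmmbuild`, and `hmmsearch` are installed and that the metagenome is already available as protein sequences in FASTA format.

```python
from pathlib import Path
import subprocess


def build_pipeline_commands(seed_fasta, metagenome_proteins, workdir):
    """Return the three command lines of an align -> build -> search chain.

    All output file names are illustrative assumptions, not MetaHMM's own.
    """
    workdir = Path(workdir)
    alignment = workdir / "seed.sto"   # multiple alignment (Stockholm format)
    model = workdir / "seed.hmm"       # profile HMM built from the alignment
    hits = workdir / "hits.tbl"        # tabular per-sequence hit report
    return [
        # 1. Align the homologous seed sequences with Clustal Omega.
        ["clustalo", "-i", str(seed_fasta), "-o", str(alignment),
         "--outfmt=st", "--force"],
        # 2. Build a profile HMM from the alignment.
        ["hmmbuild", str(model), str(alignment)],
        # 3. Search the metagenomic sequences with the profile.
        ["hmmsearch", "--tblout", str(hits), str(model),
         str(metagenome_proteins)],
    ]


def run_pipeline(seed_fasta, metagenome_proteins, workdir="."):
    """Run the three steps in order; raises if any tool exits with an error."""
    for command in build_pipeline_commands(seed_fasta, metagenome_proteins, workdir):
        subprocess.run(command, check=True)
```

Keeping command construction separate from execution makes it possible to inspect or log the exact invocations before any tool is run.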
genetics & heredity, biotechnology & applied microbiology