Microbial genome analysis: the COG approach

Michael Y Galperin, David M Kristensen, Kira S Makarova, Yuri I Wolf, Eugene V Koonin
DOI: https://doi.org/10.1093/bib/bbx117
IF: 9.5
2017-09-14
Briefings in Bioinformatics
Abstract: For the past 20 years, the Clusters of Orthologous Genes (COG) database has been a popular tool for microbial genome annotation and comparative genomics. Initially created for the evolutionary classification of protein families, the COGs have been used, apart from straightforward functional annotation of sequenced genomes, for such tasks as (i) unification of genome annotation in groups of related organisms; (ii) identification of missing and/or undetected genes in complete microbial genomes; (iii) analysis of genomic neighborhoods, in many cases allowing prediction of novel functional systems; (iv) analysis of metabolic pathways and prediction of alternative forms of enzymes; (v) comparison of organisms by COG functional categories; and (vi) prioritization of targets for structural and functional characterization. Here we review the principles of the COG approach and discuss its key advantages and drawbacks in microbial genome analysis.
biochemical research methods, mathematical & computational biology