CELEBRIMBOR: core and accessory genes from metagenomes

Joel Hellewell, Samuel T Horsfield, Johanna von Wachsmann, Tatiana A Gurbich, Robert D Finn, Zamin Iqbal, Leah W Roberts, John A Lees
DOI: https://doi.org/10.1093/bioinformatics/btae542
IF: 5.8
2024-09-02
Bioinformatics
Abstract:
Motivation: Metagenome-Assembled Genomes (MAGs) or Single-cell Amplified Genomes (SAGs) are often incomplete, with sequences missing due to errors in assembly or low coverage. This presents a particular challenge for the identification of true gene frequencies within a microbial population, as core genes missing in only a few assemblies will be mischaracterized by current pangenome approaches.
Results: Here, we present CELEBRIMBOR, a Snakemake pangenome analysis pipeline which uses a measure of genome completeness to automatically adjust the frequency threshold at which core genes are identified, enabling accurate core gene identification in MAGs and SAGs.
Availability and implementation: CELEBRIMBOR is published under open source Apache 2.0 licence at https://github.com/bacpop/CELEBRIMBOR and is available as a Docker container from this repository. Supplementary material is available in the online version of the article.
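The idea of lowering the core threshold to account for incomplete assemblies can be illustrated with a deliberately naive sketch. It is not the model CELEBRIMBOR implements, which the abstract does not describe. It assumes only that each genome has a completeness estimate between 0 and 1, and that a gene present in a fraction `core_fraction` of the true genomes would be observed at roughly that fraction multiplied by the mean completeness. The function names, the 0.95 default, and the example counts are all hypothetical.

```python
from typing import Dict, Sequence


def adjusted_core_threshold(completeness: Sequence[float],
                            core_fraction: float = 0.95) -> float:
    """Scale a nominal core threshold by mean genome completeness.

    Illustrative assumption: a gene truly present in `core_fraction` of
    genomes is expected to be observed in about core_fraction * mean
    completeness of the (incomplete) assemblies.
    """
    if not completeness:
        raise ValueError("at least one completeness value is required")
    if any(not 0.0 <= c <= 1.0 for c in completeness):
        raise ValueError("completeness values must lie in [0, 1]")
    mean_completeness = sum(completeness) / len(completeness)
    return core_fraction * mean_completeness


def classify_genes(gene_counts: Dict[str, int],
                   n_genomes: int,
                   threshold: float) -> Dict[str, str]:
    """Label each gene 'core' or 'accessory' by its observed frequency."""
    return {
        gene: "core" if count / n_genomes >= threshold else "accessory"
        for gene, count in gene_counts.items()
    }


# Hypothetical data: 20 assemblies, each estimated to be 90% complete.
completeness = [0.9] * 20
counts = {"geneA": 18, "geneB": 10}

strict = classify_genes(counts, 20, 0.95)
adjusted = classify_genes(counts, 20, adjusted_core_threshold(completeness))
```

In this toy case `geneA` is observed in 18 of 20 assemblies, so a fixed 0.95 cut-off labels it accessory, while the completeness-scaled cut-off labels it core. `geneB` stays accessory under both.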