TaxSEA: an R package for rapid interpretation of differential abundance analysis output.

Feargal J Ryan
DOI: https://doi.org/10.1101/2024.11.20.624438
2024-11-21
Abstract: Microbial communities are essential regulators of ecosystem function, with their composition commonly assessed through DNA sequencing. Most current tools focus on detecting changes among individual taxa (e.g., species or genera). However, in other omics fields, such as transcriptomics, enrichment analyses like Gene Set Enrichment Analysis (GSEA) are commonly used to uncover patterns not seen with individual features. Here, we introduce TaxSEA, an R package for taxon set enrichment analysis. TaxSEA integrates taxon sets from five public microbiota databases (BugSigDB, MiMeDB, GutMGene, mBodyMap, and GMRepoV2) to assess whether disease signatures, metabolite producers, or previously reported associations are enriched or depleted in a metagenomic dataset of interest. In silico assessments show TaxSEA is accurate across a range of set sizes. When applied to differential abundance analysis output from Inflammatory Bowel Disease and Type 2 Diabetes metagenomic data, TaxSEA outperforms current tools and can rapidly identify changes in functional groups corresponding to known associations. We also show that TaxSEA is robust to the choice of differential abundance (DA) analysis package. In summary, TaxSEA enables researchers to efficiently contextualize their findings within the broader microbiome literature, facilitating rapid interpretation and advancing understanding of microbiome-host and environmental interactions.
Biology