scPathoQuant: A tool for efficient alignment and quantification of pathogen sequence reads from 10x single cell sequencing data sets

Leanne S Whitmore, Jennifer Tisoncik-Go, Michael Gale
DOI: https://doi.org/10.1093/bioinformatics/btae145
IF: 5.8
2024-03-13
Bioinformatics
Abstract:
Motivation: There is currently a lack of efficient computational pipelines and tools for simultaneously mapping pathogen-derived and host reads from the single cell RNA sequencing (scRNAseq) output of pathogen-infected cells. Contemporary options involve multiple steps and/or running multiple computational tools, which increases the time users spend on operations.
Results: To address the need for tools that directly map and quantify pathogen and host sequence reads from within an infected cell in scRNAseq data sets in a single operation, we have built a Python package called scPathoQuant. scPathoQuant extracts sequences that were not aligned to the primary host genome, maps them to a pathogen genome of interest (demonstrated here for viral pathogens), quantifies total reads mapping to the entire pathogen, quantifies reads mapping to individual pathogen genes, and finally reintegrates the pathogen sequence counts into the matrix files used by standard single cell pipelines for downstream analyses, all with a single command. We demonstrate that scPathoQuant provides a viral and host genome-wide sequence read abundance analysis of scRNAseq data that can differentiate and define multiple viruses in the scRNAseq output of a single sample.
Availability: The scPathoQuant (SPQ) package is available at https://github.com/galelab/scPathoQuant (DOI 10.5281/zenodo.10463670), with test code and data sets available at https://github.com/galelab/Whitmore_scPathoQuant_testSets (DOI 10.5281/zenodo.10463677) to serve as a resource for the community.
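The quantification and reintegration steps described in the Results can be illustrated with a minimal Python sketch. This is not scPathoQuant's code or API: the function names, the `(cell_barcode, umi, pathogen_gene)` input format, the UMI de-duplication rule, and the dense list-of-lists matrix are all assumptions made for illustration (real 10x matrices are stored in a sparse format).

```python
from collections import defaultdict


def count_pathogen_umis(alignments):
    """Count pathogen molecules per gene and per cell barcode.

    `alignments` is a hypothetical iterable of
    (cell_barcode, umi, pathogen_gene) tuples, one per read that
    aligned to the pathogen genome.
    """
    seen = set()
    counts = defaultdict(lambda: defaultdict(int))
    for barcode, umi, gene in alignments:
        key = (barcode, umi, gene)
        if key in seen:
            # Assumption: reads sharing barcode, UMI and gene are
            # duplicates of one molecule and are counted once.
            continue
        seen.add(key)
        counts[gene][barcode] += 1
    return counts


def append_to_matrix(features, barcodes, matrix, counts):
    """Return copies of the feature list and count matrix with one
    extra row per pathogen gene (rows = features, columns = barcodes)."""
    new_features = list(features)
    new_matrix = [list(row) for row in matrix]
    for gene in sorted(counts):
        new_features.append(gene)
        # Cells with no pathogen reads for this gene get a zero.
        new_matrix.append([counts[gene].get(bc, 0) for bc in barcodes])
    return new_features, new_matrix


# Toy data: two cells, two host genes, two hypothetical viral genes.
alignments = [
    ("AAAC", "U1", "virus_geneA"),
    ("AAAC", "U1", "virus_geneA"),  # duplicate read of the same molecule
    ("AAAC", "U2", "virus_geneA"),
    ("TTTG", "U3", "virus_geneB"),
]
features = ["HOST1", "HOST2"]
barcodes = ["AAAC", "TTTG"]
matrix = [[5, 0], [1, 7]]

counts = count_pathogen_umis(alignments)
features_out, matrix_out = append_to_matrix(features, barcodes, matrix, counts)
```

Appending pathogen genes as extra feature rows keeps the host counts untouched, so a downstream single cell pipeline can treat host and pathogen features the same way.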
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology