AsaruSim: a single-cell and spatial RNA-Seq Nanopore long-reads simulation workflow

Ali Hamraoui, Morgane Thomas-Chollier, Laurent Jourdren
DOI: https://doi.org/10.1101/2024.09.20.613625
2024-09-24
Abstract: Motivation: The combination of long-read sequencing technologies like Oxford Nanopore with single-cell RNA sequencing (scRNAseq) assays enables the detailed exploration of transcriptomic complexity, including isoform detection and quantification, by capturing full-length cDNAs. However, challenges remain, including the lack of advanced simulation tools that can effectively mimic the unique complexities of scRNAseq long-read datasets. Such tools are essential for the evaluation and optimization of isoform detection methods dedicated to single-cell long-read studies. Results: We developed AsaruSim, a workflow that simulates synthetic single-cell long-read Nanopore datasets, closely mimicking real experimental data. AsaruSim employs a multi-step process that includes the creation of a synthetic UMI count matrix, generation of perfect reads, optional PCR amplification, introduction of sequencing errors, and comprehensive quality control reporting. Applied to a dataset of human peripheral blood mononuclear cells (PBMCs), AsaruSim accurately reproduced experimental read characteristics. Availability and implementation: The source code and full documentation are available at: https://github.com/GenomiqueENS/AsaruSim. Data availability: The 1,090 human PBMCs count matrix and cell type annotation files are accessible on Zenodo under DOI: 10.5281/zenodo.12731408.
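The multi-step process named in the abstract (UMI count matrix, perfect reads, optional PCR amplification, sequencing errors) can be illustrated with a toy Python sketch. This is not AsaruSim's code or interface: every function name, parameter, and sequence below is hypothetical, the read layout (barcode + UMI + transcript) is simplified, PCR is modeled as random duplication, and the error model uses substitutions only, which is far cruder than a real Nanopore error profile.

```python
import random

BASES = "ACGT"

def make_perfect_reads(counts, transcripts, rng, umi_len=12):
    """Emit one error-free read per UMI count: barcode + random UMI + transcript.

    `counts` maps (cell_barcode, transcript_id) -> number of UMIs (hypothetical layout).
    """
    reads = []
    for (barcode, transcript_id), n_umis in counts.items():
        for _ in range(n_umis):
            umi = "".join(rng.choice(BASES) for _ in range(umi_len))
            reads.append(barcode + umi + transcripts[transcript_id])
    return reads

def amplify(reads, cycles, efficiency, rng):
    """Toy PCR: at each cycle, every molecule is copied with probability `efficiency`."""
    pool = list(reads)
    for _ in range(cycles):
        pool += [r for r in pool if rng.random() < efficiency]
    return pool

def add_errors(read, error_rate, rng):
    """Toy error model: independent substitutions only (no insertions or deletions)."""
    out = []
    for base in read:
        if rng.random() < error_rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

# Made-up inputs: two cells, two transcripts, three UMIs in total.
rng = random.Random(0)
counts = {("ACGTACGTACGTACGT", "tx1"): 2, ("TTTTCCCCGGGGAAAA", "tx2"): 1}
transcripts = {"tx1": "ATGGCCATTGTAATGGGCCGC", "tx2": "ATGAAACCCGGGTTTTAG"}

perfect = make_perfect_reads(counts, transcripts, rng)
amplified = amplify(perfect, cycles=2, efficiency=0.5, rng=rng)
noisy = [add_errors(r, error_rate=0.05, rng=rng) for r in amplified]
```

Because the perfect reads are kept alongside the noisy ones, a downstream isoform-detection method can be scored against a known ground truth, which is the evaluation use case the abstract describes.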
Bioinformatics