scfetch: an R package to access and format single-cell RNA sequencing datasets from public repositories

Yabing Song, Jiaxin Gao, Jianbin Wang
DOI: https://doi.org/10.1101/2023.11.18.567507
2023-01-01
Abstract:

Summary: Downloading and reanalyzing existing single-cell RNA sequencing (scRNA-seq) datasets is an efficient way to gain new insights. However, no existing tool accesses the diverse scRNA-seq datasets (fastq/bam files, count matrices and processed objects) distributed across various repositories, accounts for the features of datasets generated by different scRNA-seq protocols, and prepares the data for downstream analysis. Here, we present scfetch, an R package that downloads diverse scRNA-seq datasets from SRA, GEO, PanglaoDB, UCSC Cell Browser, Zenodo and CELLxGENE, and loads the downloaded datasets into Seurat. scfetch supports scRNA-seq datasets generated by different protocols, such as 10x Genomics and Smart-seq2. In addition, scfetch lets users convert between different scRNA-seq object formats, including SeuratObject, AnnData, SingleCellExperiment, CellDataSet/cell\_data\_set and loom. scfetch also supports downloading fastq/bam files and count matrices of bulk RNA-seq data from SRA and GEO.

Availability and Implementation: The scfetch package and vignettes are freely available at <https://github.com/showteeth/scfetch> and <https://showteeth.github.io/scfetch/>.

Contact: gaojx{at}im.ac.cn, jianbinwang{at}tsinghua.edu.cn

Supplementary information: Supplementary data are appended.

### Competing Interest Statement

The authors have declared no competing interest.