Systematic benchmarking of single-cell ATAC-sequencing protocols

Florian V De Rop, Gert Hulselmans, Chris Flerin, Paula Soler-Vila, Albert Rafels, Valerie Christiaens, Carmen Bravo González-Blas, Domenica Marchese, Ginevra Caratù, Suresh Poovathingal, Orit Rozenblatt-Rosen, Michael Slyper, Wendy Luo, Christoph Muus, Fabiana Duarte, Rojesh Shrestha, S Tansu Bagdatli, M Ryan Corces, Lira Mamanova, Andrew Knights, Kerstin B Meyer, Ryan Mulqueen, Akram Taherinasab, Patrick Maschmeyer, Jörn Pezoldt, Camille Lucie Germaine Lambert, Marta Iglesias, Sebastián R Najle, Zain Y Dossani, Luciano G Martelotto, Zach Burkett, Ronald Lebofsky, José Ignacio Martin-Subero, Satish Pillai, Arnau Sebé-Pedrós, Bart Deplancke, Sarah A Teichmann, Leif S Ludwig, Theodore P Braun, Andrew C Adey, William J Greenleaf, Jason D Buenrostro, Aviv Regev, Stein Aerts, Holger Heyn
DOI: https://doi.org/10.1038/s41587-023-01881-x
Abstract: Single-cell assay for transposase-accessible chromatin by sequencing (scATAC-seq) has emerged as a powerful tool for dissecting regulatory landscapes and cellular heterogeneity. However, an exploration of systemic biases among scATAC-seq technologies has remained absent. In this study, we benchmark the performance of eight scATAC-seq methods across 47 experiments using human peripheral blood mononuclear cells (PBMCs) as a reference sample and develop PUMATAC, a universal preprocessing pipeline, to handle the various sequencing data formats. Our analyses reveal significant differences in sequencing library complexity and tagmentation specificity, which impact cell-type annotation, genotype demultiplexing, peak calling, differential region accessibility and transcription factor motif enrichment. Our findings underscore the importance of sample extraction, method selection, data processing and total cost of experiments, offering valuable guidance for future research. Finally, our data and analysis pipeline encompasses 169,000 PBMC scATAC-seq profiles and a best practices code repository for scATAC-seq data analysis, which are freely available to extend this benchmarking effort to future protocols.