Guidelines for reproducible analysis of adaptive immune receptor repertoire sequencing data

Ayelet Peres, Vered Klein, Boaz Frankel, William Lees, Pazit Polak, Mark Meehan, Artur Rocha, João Correia Lopes, Gur Yaari
DOI: https://doi.org/10.1093/bib/bbae221
2024-03-27
Abstract: Enhancing the reproducibility and comprehension of adaptive immune receptor repertoire sequencing (AIRR-seq) data analysis is critical for scientific progress. This study presents guidelines for reproducible AIRR-seq data analysis, and a collection of ready-to-use pipelines with comprehensive documentation. To this end, ten common pipelines were implemented using ViaFoundry, a user-friendly interface for pipeline management and automation. This is accompanied by versioned containers, documentation and archiving capabilities. The automation of pre-processing analysis steps and the ability to modify pipeline parameters according to specific research needs are emphasized. AIRR-seq data analysis is highly sensitive to varying parameters and setups; using the guidelines presented here, the ability to reproduce previously published results is demonstrated. This work promotes transparency, reproducibility, and collaboration in AIRR-seq data analysis, serving as a model for handling and documenting bioinformatics pipelines in other research domains.