Identification of B cell subsets based on antigen receptor sequences using deep learning

Hyunho Lee, Kyoungseob Shin, Yongju Lee, Soobin Lee, Seungyoun Lee, Eunjae Lee, Seung Woo Kim, Ha Young Shin, Jong Hoon Kim, Junho Chung, Sunghoon Kwon
DOI: https://doi.org/10.1101/2024.02.06.579098
2024-02-08
Abstract: B cell receptors (BCRs) denote antigen specificity, while corresponding cell subsets indicate B cell functionality. Since each B cell uniquely encodes this combination, physical isolation and subsequent processing of individual B cells become indispensable to identify both attributes. However, this approach incurs high costs and inevitable information loss, hindering high-throughput investigation of B cell populations. Here, we present BCR-SORT, a deep learning model that predicts cell subsets from their corresponding BCR sequences by leveraging B cell activation and maturation signatures encoded within BCR sequences. BCR-SORT is then shown to improve reconstruction of BCR phylogenetic trees and to reproduce results consistent with those verified using physical isolation-based methods or prior knowledge. Notably, when applied to BCR sequences from COVID-19 vaccine recipients, it revealed inter-individual heterogeneity of evolutionary trajectories towards Omicron-binding memory B cells. Overall, BCR-SORT offers great potential to improve our understanding of B cell responses.
Bioinformatics