On generative models of T-cell receptor sequences

Giulio Isacchini, Zachary Sethna, Yuval Elhanati, Armita Nourmohammad, Aleksandra M. Walczak, Thierry Mora
DOI: https://doi.org/10.1103/PhysRevE.101.062414
2020-03-14
Abstract: T-cell receptors (TCR) are key proteins of the adaptive immune system, generated randomly in each individual, whose diversity underlies our ability to recognize infections and malignancies. Modeling the distribution of TCR sequences is of key importance for immunology and medical applications. Here, we compare two inference methods trained on high-throughput sequencing data: a knowledge-guided approach, which accounts for the details of sequence generation, supplemented by a physics-inspired model of selection; and a knowledge-free Variational Auto-Encoder based on deep artificial neural networks. We show that the knowledge-guided model outperforms the deep network approach at predicting TCR probabilities, while being more interpretable, at a lower computational cost.
Quantitative Methods