Sequence-based TCR-Peptide Representations Using Cross-Epitope Contrastive Fine-tuning of Protein Language Models

Chiho Im,Ryan Zhao,Scott D. Boyd,Anshul Kundaje
DOI: https://doi.org/10.1101/2024.10.25.619698
2024-10-29
Abstract:Understanding T-Cell receptor (TCR) and epitope interactions is critical for advancing our knowledge of the human immune system. Traditional approaches that use sequence similarity or structure data often struggle to scale and generalize across diverse TCR/epitope interactions. To address these limitations, we introduce ImmuneCLIP, a contrastive fine-tuning method that leverages pre-trained protein language models to align TCR and epitope embeddings in a shared latent space. ImmuneCLIP is evaluated on epitope ranking and binding prediction tasks, where it consistently outperforms sequence-similarity based methods and existing deep learning models. Furthermore, ImmuneCLIP shows strong generalization capabilities even with limited training data, highlighting its potential for studying diverse immune interactions and uncovering patterns that improve our understanding of human immune recognition systems.
Immunology
What problem does this paper attempt to address?