Accurate sequence-to-affinity models for SH2 domains from multi-round peptide binding assays coupled with free-energy regression

Dejan Gagoski, H. Tomas Rube, Chaitanya Rastogi, Lucas A.N. Melo, Xiaoting Li, Rashmi Voleti, Neel H Shah, Harmen J Bussemaker
DOI: https://doi.org/10.1101/2024.12.23.630085
2024-12-23
Abstract: Short linear peptide motifs play important roles in cell signaling. They can act as modification sites for enzymes and as recognition sites for peptide binding domains. SH2 domains bind specifically to tyrosine-phosphorylated proteins, with the affinity of the interaction depending strongly on the flanking sequence. Quantifying this sequence specificity is critical for deciphering phosphotyrosine-dependent signaling networks. In recent years, protein display technologies and deep sequencing have allowed researchers to profile SH2 domain binding across thousands of candidate ligands. Here, we present a concerted experimental and computational strategy that improves the predictive power of SH2 specificity profiling. Through multi-round affinity selection and deep sequencing with large randomized phosphopeptide libraries, we produce suitable data to train an additive binding free energy model that covers the full theoretical ligand sequence space. Our models can be used to predict signaling network connectivity and the impact of missense variants in phosphoproteins on SH2 binding.
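To make the idea of an additive binding free energy model concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each flanking position contributes an independent penalty (ΔΔG, in kcal/mol) and that relative affinity follows exp(-ΔΔG/RT). The penalty table, the default penalty, the lowercase "y" notation for phosphotyrosine, and all function names are hypothetical and chosen only for illustration.

```python
import math

RT_KCAL_PER_MOL = 0.593  # approximate value of RT near 298 K

# Hypothetical penalties (kcal/mol) relative to the best residue at each
# offset from the phosphotyrosine; these are NOT fitted values from the paper.
TOY_DDG = {
    1: {"E": 0.0, "A": 0.8},
    3: {"I": 0.0, "A": 1.2},
}
DEFAULT_PENALTY = 1.5  # assumed penalty for residues absent from the table


def additive_ddg(peptide, table=TOY_DDG, default=DEFAULT_PENALTY):
    """Sum independent per-position penalties; 'y' marks the phosphotyrosine."""
    center = peptide.index("y")
    total = 0.0
    for offset, penalties in table.items():
        i = center + offset
        residue = peptide[i] if 0 <= i < len(peptide) else None
        total += penalties.get(residue, default)
    return total


def relative_affinity(peptide, table=TOY_DDG):
    """Affinity relative to the optimal ligand under the additive assumption."""
    return math.exp(-additive_ddg(peptide, table) / RT_KCAL_PER_MOL)


def expected_enrichment(peptide, rounds, table=TOY_DDG):
    """Idealized enrichment after several selection rounds, assuming each
    round multiplies a ligand's frequency by its relative affinity."""
    return relative_affinity(peptide, table) ** rounds


if __name__ == "__main__":
    for pep in ("EPQyEEIPI", "EPQyAEAPI"):
        print(pep, additive_ddg(pep), relative_affinity(pep))
```

Because the penalties add, a missense variant in the flanking sequence changes the predicted ΔΔG only through the term for the mutated position, which is what makes scoring variants across the full sequence space tractable in a model of this form.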
Systems Biology
What problem does this paper attempt to address?