Protein language models enable prediction of polyreactivity of monospecific, bispecific, and heavy-chain-only antibodies

Xin Yu, Kostika Vangjeli, Anusha Prakash, Meha Chhaya, Samantha J Stanley, Noah Cohen, Lili Huang
DOI: https://doi.org/10.1101/2023.11.06.565888
2023-11-08
bioRxiv
Abstract: Early assessment of antibody off-target binding is essential for mitigating developability risks such as fast clearance, reduced efficacy, toxicity, and immunogenicity. The baculovirus particle (BVP) binding assay has been widely utilized to evaluate polyreactivity of antibodies. As a complementary approach, computational prediction of polyreactivity is desirable for counter-screening antibodies from in silico discovery campaigns. However, there is a lack of such models. Herein, we present the development of an ensemble of three deep learning models based on two pan-protein foundational protein language models (ESM2 and ProtT5) and an antibody-specific protein language model (Antiberty). These models were trained in a transfer learning network to predict the outcomes in the BVP assay and the bovine serum albumin (BSA) binding assay, which was developed as a complement to the BVP assay. The training was conducted on a large dataset of antibody sequences augmented with experimental conditions, which were collected through a highly efficient application system. The resulting models demonstrated robust performance on normal mAbs (monospecific with heavy and light chain), bispecific Abs, and single-domain Fc (VHH-Fc). Protein language models outperformed a model built using molecular descriptors calculated from AlphaFold 2 predicted structures. Embeddings from the antibody-specific and foundational protein language models resulted in similar performance. To our knowledge, this represents the first application of protein language models to predict assay data on bispecifics and VHH-Fcs. Our study yields valuable insights into building infrastructures to support machine learning activities and training models for critical assays in antibody discovery.
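The ensemble idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the random vectors stand in for frozen ESM2, ProtT5, and Antiberty embeddings, the embedding sizes are toy values, the `conditions` vector is a hypothetical encoding of experimental conditions, each model's head is a single logistic layer, and the three predictions are combined by simple averaging (the abstract does not state the combination rule).

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def head_predict(embedding, conditions, weights, bias):
    """Hypothetical linear head on top of a frozen embedding.

    The sequence embedding is concatenated with the encoded assay
    conditions, then passed through one logistic layer.
    """
    features = np.concatenate([embedding, conditions])
    return float(sigmoid(features @ weights + bias))


def ensemble_predict(probabilities):
    """Combine per-model probabilities by averaging (assumed rule)."""
    return float(np.mean(probabilities))


rng = np.random.default_rng(0)

# Toy embedding sizes; the real language models produce far larger vectors.
dims = {"esm2": 8, "prott5": 6, "antiberty": 4}

# Hypothetical encoding of experimental conditions for one measurement.
conditions = np.array([1.0, 0.0])

probs = []
for name, dim in dims.items():
    embedding = rng.normal(size=dim)  # placeholder for a frozen embedding
    weights = rng.normal(size=dim + conditions.size)  # placeholder for trained weights
    probs.append(head_predict(embedding, conditions, weights, bias=0.0))

score = ensemble_predict(probs)
print(f"per-model probabilities: {probs}")
print(f"ensemble polyreactivity score: {score:.3f}")
```

In a transfer learning setup of this kind, only the small head is trained on assay labels while the language model weights stay fixed, which keeps training feasible on assay-scale datasets.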