AntiFold: Improved antibody structure-based design using inverse folding

Magnus Haraldson Høie, Alissa Hummer, Tobias H. Olsen, Broncio Aguilar-Sanjuan, Morten Nielsen, Charlotte M. Deane
2024-05-06
Abstract: The design and optimization of antibodies requires an intricate balance across multiple properties. Protein inverse folding models, capable of generating diverse sequences folding into the same structure, are promising tools for maintaining structural integrity during antibody design. Here, we present AntiFold, an antibody-specific inverse folding model, fine-tuned from ESM-IF1 on solved and predicted antibody structures. AntiFold outperforms existing inverse folding tools on sequence recovery across complementarity-determining regions, with designed sequences showing high structural similarity to their solved counterpart. It additionally achieves stronger correlations when predicting antibody-antigen binding affinity in a zero-shot manner, while performance is augmented further when including antigen information. AntiFold assigns low probabilities to mutations that disrupt antigen binding, synergizing with protein language model residue probabilities, and demonstrates promise for guiding antibody optimization while retaining structure-related properties. AntiFold is freely available under the BSD 3-Clause as a web server at
Biomolecules, Quantitative Methods
What problem does this paper attempt to address?
This paper addresses a central problem in antibody design: optimizing multiple antibody properties while maintaining structural integrity. It introduces AntiFold, an antibody-specific inverse folding model fine-tuned from ESM-IF1 on solved and predicted antibody structures. The main contributions of AntiFold are as follows:

1. **Improved sequence recovery**: AntiFold outperforms existing inverse folding tools on sequence recovery, particularly in the complementarity-determining regions (CDRs). The designed sequences, once refolded, show high structural similarity to the original structure.
2. **Antibody-antigen binding affinity prediction**: AntiFold achieves stronger correlations when predicting antibody-antigen binding affinity in a zero-shot setting. Performance improves further when antigen information is included, especially when the CDR region is close to the antigen-binding site.
3. **Guiding antibody optimization**: AntiFold can guide antibody optimization by narrowing the search space that needs experimental validation, which reduces the experimental burden. It assigns low probabilities to mutations that disrupt antigen binding and complements the residue probabilities of protein language models, helping to retain structure-related properties.

### Main technical details

- **Dataset**: AntiFold builds on the pre-trained ESM-IF1 model and is further fine-tuned on solved and predicted antibody structures. The training, validation, and test datasets are drawn from the Structural Antibody Database (SAbDab, solved structures) and the Observed Antibody Space (OAS, predicted structures).
- **Training strategy**: To improve performance on the validation set, especially amino acid recovery (AAR) in the heavy-chain CDR3 region, several strategies are used: span masking, random residue masking, masking weighted towards CDR residues, layer-wise learning rate decay, and the inclusion of OAS predicted structures.
- **Performance evaluation**: On the experimental structure test set, AntiFold achieves a higher amino acid recovery than the original ESM-IF1 model, and it outperforms other existing tools such as AbMPNN on multiple metrics. In addition, the generated sequences, once refolded, are structurally consistent with the original structure.

### Application scenarios

- **Antibody design**: AntiFold can be used to design new antibody sequences while maintaining their structural characteristics, which is particularly important for therapeutic antibodies.
- **Binding affinity prediction**: AntiFold can predict antibody-antigen binding affinity, which helps in screening for high-affinity antibody variants.
- **Optimizing experimental design**: By reducing the search space for experimental validation, AntiFold can guide the antibody optimization process and improve R&D efficiency.

### Conclusion

By fine-tuning on antibody-specific structures, AntiFold improves the quality of antibody design, especially in sequence recovery and binding affinity prediction. The tool supports antibody optimization and is expected to accelerate the R&D process for therapeutic antibodies.
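
To make two of the quantities in this summary concrete, the following is a minimal sketch of amino acid recovery (AAR) over a chosen region and of a log-odds score for a single mutation computed from per-position residue probabilities. It does not use AntiFold itself. The function names, the example sequences, and the probability values are hypothetical, and the log-odds score is one common way to turn residue probabilities into a zero-shot mutation score, not necessarily the scoring used in the paper.

```python
import math
from typing import Dict, Iterable, List, Optional


def amino_acid_recovery(
    native: str,
    designed: str,
    positions: Optional[Iterable[int]] = None,
) -> float:
    """Fraction of scored positions where the designed residue equals the native one.

    `positions` holds 0-based indices (for example, the residues of one CDR).
    If it is None, every position is scored.
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    idx = list(range(len(native))) if positions is None else list(positions)
    if not idx:
        raise ValueError("no positions to score")
    matches = sum(native[i] == designed[i] for i in idx)
    return matches / len(idx)


def mutation_log_odds(
    probs: List[Dict[str, float]],
    position: int,
    wild_type: str,
    mutant: str,
) -> float:
    """log p(mutant) - log p(wild_type) at one position.

    `probs[i]` maps residue letters to model probabilities at position i.
    A negative score means the model prefers the wild-type residue.
    """
    p = probs[position]
    return math.log(p[mutant]) - math.log(p[wild_type])


# Hypothetical 12-residue region and a designed variant with two substitutions.
native_seq = "ARDYYGSSYFDY"
designed_seq = "ARDYYGSGYFDV"
region_aar = amino_acid_recovery(native_seq, designed_seq)

# Made-up probabilities for a single position, for illustration only.
toy_probs = [{"S": 0.6, "G": 0.3, "A": 0.1}]
s_to_g_score = mutation_log_odds(toy_probs, 0, "S", "G")

print(f"AAR: {region_aar:.3f}")
print(f"log-odds S->G: {s_to_g_score:.3f}")
```

In practice the per-position probabilities would come from an inverse folding model's output for a given structure, and scores like these would be compared against measured binding affinities with a rank correlation.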