Fast and accurate modeling and design of antibody-antigen complex using tFold

Fandi Wu, Yu Zhao, Jiaxiang Wu, Biaobin Jiang, Bing He, Longkai Huang, Chenchen Qin, Fan Yang, Ningqiao Huang, Yang Xiao, Rubo Wang, Huaxian Jia, Yu Rong, Yuyi Liu, Houtim Lai, Tingyang Xu, Wei Liu, Peilin Zhao, Jianhua Yao
DOI: https://doi.org/10.1101/2024.02.05.578892
2024-02-08
Abstract: Accurate prediction of antibody-antigen complex structures holds significant potential for advancing biomedical research and the design of therapeutic antibodies. Structure prediction for protein monomers has achieved considerable success, and promising progress has been made in extending this achievement to protein complexes. Despite these advances, fast and accurate prediction of antibody-antigen complex structures remains an unresolved challenge. Existing end-to-end prediction methods, which rely on homology and templates, exhibit sub-optimal accuracy due to the absence of co-evolutionary constraints. Meanwhile, conventional docking-based methods have difficulty identifying the contact interface between the antigen and antibody and require known structures of the individual components as inputs. In this study, we present a fully end-to-end approach for three-dimensional (3D) atomic-level structure prediction of antibodies and antibody-antigen complexes, referred to as tFold-Ab and tFold-Ag, respectively. tFold leverages a large protein language model to extract both intra-chain and inter-chain residue-residue contact information, as well as evolutionary relationships, avoiding the time-consuming multiple sequence alignment (MSA) search. Combined with specially designed modules such as the AI-driven flexible docking module, it achieves superior performance and significantly enhanced speed compared to AlphaFold-Multimer in predicting both antibody structures (1.6% RMSD reduction in the CDR-H3 region, a thousand times faster) and antibody-antigen complex structures (37% increase in DockQ score, over 10 times faster). Given these performance and speed advantages, we further extend tFold to structure-based virtual screening of binding antibodies, as well as de novo co-design of both structure and sequence for therapeutic antibodies. The experimental results demonstrate the potential of tFold as a high-throughput tool to enhance the processes involved in these tasks. To facilitate public access, we release code and offer a web service for antibody and antigen-antibody complex structure prediction, which is available at .
Bioinformatics
What problem does this paper attempt to address?
This paper introduces tFold, a method for fast and accurate prediction of the three-dimensional atomic-level structure of antibodies and antibody-antigen complexes. Although structure prediction for individual proteins has advanced considerably, antibody-antigen complexes remain difficult: end-to-end methods that rely on homology and templates lose accuracy because co-evolutionary constraints are absent, and docking-based methods struggle to identify the contact interface and require known structures of the individual components. tFold uses a large protein language model to extract intra-chain and inter-chain residue-residue contact information and evolutionary relationships, which avoids the time-consuming multiple sequence alignment (MSA) search. Combined with specially designed modules such as an AI-driven flexible docking module, it improves on AlphaFold-Multimer in both accuracy and speed: for antibodies (tFold-Ab), a 1.6% RMSD reduction in the CDR-H3 region at about a thousand times the speed; for antibody-antigen complexes (tFold-Ag), a 37% increase in DockQ score at over 10 times the speed. The authors also apply tFold to structure-based virtual screening of binding antibodies and to de novo co-design of the structure and sequence of therapeutic antibodies. The experimental results suggest that tFold can serve as a high-throughput tool for these tasks, and the code and a web service have been released for public use.
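The accuracy figures above are relative changes in a metric (RMSD, where lower is better, and DockQ, where higher is better) against a baseline. As a minimal sketch of that arithmetic, the Python below computes an RMSD between two sets of coordinates and the signed relative change between two scores. The function names `rmsd` and `relative_change` and the toy coordinates are hypothetical illustrations, not taken from the paper or from the tFold code, and the sketch assumes the two structures have already been superposed; the paper's actual evaluation protocol is not described in this summary.

```python
import math


def rmsd(pred, ref):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumption: both coordinate sets are already superposed in the same frame;
    no alignment is performed here.
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (p - r) ** 2
        for atom_pred, atom_ref in zip(pred, ref)
        for p, r in zip(atom_pred, atom_ref)
    )
    return math.sqrt(squared / len(pred))


def relative_change(new, baseline):
    """Signed change of `new` relative to `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


# Toy data (hypothetical, not from the paper): three reference atoms and two
# predictions displaced from them by 1 and 2 units along the z axis.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
model_a = [(x, y, z + 1.0) for x, y, z in reference]
model_b = [(x, y, z + 2.0) for x, y, z in reference]

rmsd_a = rmsd(model_a, reference)
rmsd_b = rmsd(model_b, reference)
# A negative value means model_a has a lower (better) RMSD than model_b.
change = relative_change(rmsd_a, rmsd_b)
```

Under this sign convention a reported "RMSD reduction" corresponds to a negative relative change, while an "increase in DockQ score" corresponds to a positive one.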