Chai-1: Decoding the molecular interactions of life

Chai Discovery, Jacques Boitreaud, Jack Dent, Matthew McPartlon, Joshua Meier, Vinicius Reis, Alex Rogozhnikov, Kevin Wu
DOI: https://doi.org/10.1101/2024.10.10.615955
2024-10-15
Abstract: We introduce Chai-1, a multi-modal foundation model for molecular structure prediction that performs at the state-of-the-art across a variety of tasks relevant to drug discovery. Chai-1 can optionally be prompted with experimental restraints (e.g. derived from wet-lab data) which boosts performance by double-digit percentage points. Chai-1 can also be run in single-sequence mode without MSAs while preserving most of its performance. We release Chai-1 model weights and inference code as a python package for non-commercial use and via a web interface where it can be used for free including for commercial drug discovery purposes.
Synthetic Biology
What problem does this paper attempt to address?
The paper addresses several challenges in biomolecular structure prediction, particularly those relevant to drug discovery. It introduces Chai-1, a multimodal foundation model that predicts biomolecular structures and reports state-of-the-art performance on a range of drug-discovery tasks.

### Main Issues Include:

1. **Protein-Ligand Structure Prediction**: Accurately predicting the interactions between proteins and small-molecule ligands, which is crucial for understanding how drugs bind to their targets.
2. **Protein Multimer Prediction**: Predicting the structure of multi-chain protein complexes, which is important for understanding protein-protein interactions and designing multi-target drugs.
3. **Performance in Single-Sequence Mode**: Achieving accurate structure prediction without multiple sequence alignments (MSAs), which is useful for rapid, low-cost prediction.
4. **Utilization of Experimental Constraints**: Improving prediction accuracy by incorporating experimental data (such as epitope mapping or cross-linking mass spectrometry data), especially for complex binding interfaces.
5. **Antibody-Antigen Interactions**: Predicting the structure of antibody-antigen complexes, which is significant for developing immunotherapeutic drugs.

### Solutions:

- **Model Architecture**: Chai-1 employs a multimodal neural network architecture that combines multiple sequence alignments and protein language model embeddings to capture evolutionary information and sequence features.
- **Experimental Constraints**: Pocket, contact, and docking constraint features are introduced to represent experimental data and enhance prediction accuracy. The abstract reports that prompting with such restraints boosts performance by double-digit percentage points.
- **Single-Sequence Mode**: Chai-1 can run without MSAs while preserving most of its performance, relying on its protein language model embeddings instead.
- **High Performance**: The paper reports state-of-the-art results across several benchmarks, with comparisons against existing methods such as AlphaFold3 and RoseTTAFold.

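
To make the constraint types above more concrete, the sketch below checks whether a set of predicted coordinates agrees with a contact or pocket constraint (called "restraints" in the abstract). It is only an illustration under stated assumptions: the class names, the one-point-per-residue coordinate layout, and the 8 Å threshold are hypothetical, and this is not the Chai-1 package API or the way the model consumes restraints internally, where they are input features rather than post hoc checks.

```python
import math
from dataclasses import dataclass

# Assumption: one representative 3D point (in angstroms) per residue,
# keyed by (chain_id, residue_index). Real structures carry many atoms per residue.
Coord = tuple[float, float, float]
Structure = dict[tuple[str, int], Coord]


@dataclass(frozen=True)
class ContactRestraint:
    """Hypothetical record: two specific residues expected within max_distance."""
    chain_a: str
    residue_a: int
    chain_b: str
    residue_b: int
    max_distance: float


@dataclass(frozen=True)
class PocketRestraint:
    """Hypothetical record: one residue expected within max_distance of any
    residue of a partner chain."""
    chain: str
    residue: int
    partner_chain: str
    max_distance: float


def contact_satisfied(structure: Structure, r: ContactRestraint) -> bool:
    """True if the two named residues are no farther apart than the threshold."""
    a = structure[(r.chain_a, r.residue_a)]
    b = structure[(r.chain_b, r.residue_b)]
    return math.dist(a, b) <= r.max_distance


def pocket_satisfied(structure: Structure, r: PocketRestraint) -> bool:
    """True if the named residue is within the threshold of any partner-chain residue."""
    anchor = structure[(r.chain, r.residue)]
    return any(
        math.dist(anchor, xyz) <= r.max_distance
        for (chain, _), xyz in structure.items()
        if chain == r.partner_chain
    )


# Toy coordinates, invented purely for illustration.
toy_structure: Structure = {
    ("A", 10): (0.0, 0.0, 0.0),
    ("A", 11): (3.8, 0.0, 0.0),
    ("B", 5): (0.0, 6.0, 0.0),
    ("B", 6): (0.0, 20.0, 0.0),
}

near = ContactRestraint("A", 10, "B", 5, max_distance=8.0)
far = ContactRestraint("A", 11, "B", 6, max_distance=8.0)
pocket = PocketRestraint("A", 10, "B", max_distance=8.0)

print(contact_satisfied(toy_structure, near))   # True  (6.0 angstroms apart)
print(contact_satisfied(toy_structure, far))    # False (about 20.4 angstroms apart)
print(pocket_satisfied(toy_structure, pocket))  # True  (residue B5 is within 8.0)
```

A contact constraint names both residues, while a pocket constraint names one residue and a whole partner chain. The pocket form therefore fits data such as epitope mapping, where the interacting residues are known on only one side of the interface.
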
### Summary:

Chai-1 aims to improve the accuracy and efficiency of biomolecular structure prediction through deep learning, particularly for drug discovery and immunotherapy. Beyond its benchmark results, the model accepts optional experimental constraints, which makes it more practical for real-world applications.