GraphXForm: Graph transformer for computer-aided molecular design with application to extraction

Jonathan Pirnay, Jan G. Rittig, Alexander B. Wolf, Martin Grohe, Jakob Burger, Alexander Mitsos, Dominik G. Grimm
2024-11-04
Abstract: Generative deep learning has become pivotal in molecular design for drug discovery and materials science. A widely used paradigm is to pretrain neural networks on string representations of molecules and fine-tune them using reinforcement learning on specific objectives. However, string-based models face challenges in ensuring chemical validity and enforcing structural constraints like the presence of specific substructures. We propose to instead combine graph-based molecular representations, which can naturally ensure chemical validity, with transformer architectures, which are highly expressive and capable of modeling long-range dependencies between atoms. Our approach iteratively modifies a molecular graph by adding atoms and bonds, which ensures chemical validity and facilitates the incorporation of structural constraints. We present GraphXForm, a decoder-only graph transformer architecture, which is pretrained on existing compounds and then fine-tuned using a new training algorithm that combines elements of the deep cross-entropy method with self-improvement learning from language modeling, allowing stable fine-tuning of deep transformers with many layers. We evaluate GraphXForm on two solvent design tasks for liquid-liquid extraction, showing that it outperforms four state-of-the-art molecular design techniques, while it can flexibly enforce structural constraints or initiate the design from existing molecular structures.
Machine Learning, Chemical Physics, Biomolecules
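The abstract's central idea, building a molecule by adding atoms and bonds so that it stays chemically valid, can be illustrated with a small sketch. This is not GraphXForm's implementation: the `MolGraph` class, its method names, and the four-element valence table are hypothetical and chosen only for illustration. The valences themselves are the usual values for neutral atoms.

```python
from dataclasses import dataclass, field

# Usual maximum valences for a few neutral atoms. The atom vocabulary used in
# the paper is not given in this summary; this table is an assumption.
MAX_VALENCE = {"C": 4, "N": 3, "O": 2, "F": 1}


@dataclass
class MolGraph:
    """Toy molecular graph: element symbols plus bond orders (hypothetical)."""
    atoms: list = field(default_factory=list)
    bonds: dict = field(default_factory=dict)  # (i, j) with i < j -> bond order

    def used_valence(self, i):
        """Sum of bond orders on atom i; implicit hydrogens fill the rest."""
        return sum(o for (a, b), o in self.bonds.items() if i in (a, b))

    def free_valence(self, i):
        return MAX_VALENCE[self.atoms[i]] - self.used_valence(i)

    def can_add_atom(self, symbol, attach_to, order=1):
        """True if bonding a new atom to `attach_to` respects both valences."""
        if symbol not in MAX_VALENCE or not 0 <= attach_to < len(self.atoms):
            return False
        limit = min(MAX_VALENCE[symbol], self.free_valence(attach_to))
        return 1 <= order <= limit

    def add_atom(self, symbol, attach_to=None, order=1):
        """Add an atom and return its index; reject invalid actions."""
        if attach_to is None:
            if self.atoms or symbol not in MAX_VALENCE:
                raise ValueError("only a known first atom may be unattached")
        elif not self.can_add_atom(symbol, attach_to, order):
            raise ValueError("action would violate a valence limit")
        self.atoms.append(symbol)
        new = len(self.atoms) - 1
        if attach_to is not None:
            self.bonds[(attach_to, new)] = order
        return new

    def valid_actions(self):
        """Every (symbol, attach_to, order) that keeps the graph valid."""
        return [(s, i, o)
                for i in range(len(self.atoms))
                for s in MAX_VALENCE
                for o in (1, 2, 3)
                if self.can_add_atom(s, i, o)]


if __name__ == "__main__":
    mol = MolGraph()
    c = mol.add_atom("C")
    mol.add_atom("O", attach_to=c, order=2)  # carbonyl group, C=O
    print(mol.valid_actions())  # only actions on the carbon remain
```

Because invalid actions are never offered, every intermediate graph is a valid molecule. In a sketch like this, starting the design from an existing structure amounts to beginning with a pre-populated graph, and a structural constraint can be expressed as a further filter on `valid_actions`.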
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses several challenges in molecular design, particularly in drug discovery and materials science. The authors propose a new approach to overcome the difficulties existing methods have in generating chemically valid molecular structures. The main objectives are as follows:

1. **Ensuring Chemical Validity**:
   - Methods that generate string representations of molecules (such as SMILES) often fail to ensure chemical validity, for example by violating valence rules or producing strings that do not correspond to a valid structure. The authors instead manipulate molecular graphs directly, which naturally ensures chemical validity.
2. **Handling Long-Range Dependencies**:
   - The transformer architecture is widely used in language models because it models long-range dependencies efficiently. The authors bring this capability into molecular design to better capture the interactions between atoms in a molecule.
3. **Flexibly Imposing Structural Constraints**:
   - Molecular design often requires specific structural constraints, such as a minimum number of certain atom types, bond restrictions, or the presence of specific substructures. Existing methods, especially string-based ones, struggle to enforce them. The proposed method imposes these constraints through operations on the graph.
4. **Improving the Quality of Generated Molecules**:
   - The authors aim for a design method that generates high-quality molecules and outperforms existing state-of-the-art techniques. To this end, they propose GraphXForm, a decoder-only graph transformer that constructs molecular graphs by iteratively adding atoms and bonds.
5. **Stabilizing the Training of Deep Transformers**:
   - Fine-tuning deep transformers with reinforcement learning is resource-intensive and hard to keep stable. The authors propose a new training algorithm that combines elements of the deep cross-entropy method with self-improvement learning from language modeling to achieve stable fine-tuning of deep transformers with many layers.

### Summary

The main contribution of the paper is GraphXForm, a new molecular design method. By combining graph representations with a transformer architecture, it generates chemically valid molecules and can flexibly impose structural constraints or start the design from existing molecular structures. On two solvent design tasks for liquid-liquid extraction, it outperforms four state-of-the-art molecular design techniques. A new training algorithm makes the fine-tuning of deep transformers stable.
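The training idea in item 5 can be pictured as a loop that samples candidates from the current model, keeps the best-scoring ones, and then trains the model to imitate them. The sketch below is a toy version of that general sample, select, and imitate pattern, not the paper's algorithm. It replaces the transformer with a table of per-position logits, and the function names, the scoring function, and all hyperparameters are assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_sequence(logits, rng):
    """Draw one token per position from independent categorical distributions."""
    return [rng.choices(range(len(row)), weights=softmax(row))[0]
            for row in logits]


def elite_imitation_step(logits, score, rng, n_samples=64, n_elite=8, lr=0.5):
    """One round: sample, keep the top scorers, raise their log-likelihood."""
    samples = [sample_sequence(logits, rng) for _ in range(n_samples)]
    elite = sorted(samples, key=score, reverse=True)[:n_elite]
    for pos, row in enumerate(logits):
        probs = softmax(row)
        for tok in range(len(row)):
            # Gradient of the mean elite log-likelihood w.r.t. this logit:
            # elite frequency of the token minus its current probability.
            freq = sum(seq[pos] == tok for seq in elite) / n_elite
            row[tok] += lr * (freq - probs[tok])
    return max(score(seq) for seq in elite)


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [[0.0] * 4 for _ in range(6)]  # 6 positions, 4 tokens, uniform start
    toy_score = lambda seq: seq.count(2)    # toy objective: use token 2 often
    for _ in range(40):
        best = elite_imitation_step(logits, toy_score, rng)
    print(best, [round(softmax(row)[2], 2) for row in logits])
```

The update is a supervised, cross-entropy-style step on the model's own best samples, so no reward-weighted policy gradient is involved. In the paper's setting, the sampled objects would be molecular graphs and the score a solvent design objective.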