Generative AI-Driven Molecular Design: Combining Predictive Models and Reinforcement Learning for Tailored Molecule Generation

Teslim Olayiwola, Miriam Nnadili, Jose Romagnoli, Andrew Okafor, David Akinpelu
DOI: https://doi.org/10.26434/chemrxiv-2023-wcrv3
2023-11-17
Abstract: Molecular design is a critical aspect of various scientific and industrial fields, where the properties of molecules hold significant importance. In this study, a three-part methodology is presented that combines generative artificial intelligence (AI), predictive modeling, and reinforcement learning to create tailored molecules with desired properties. The framework pairs deep learning techniques with the Self-Referencing Embedded Strings (SELFIES) molecular representation to build a generative model that produces valid molecules, alongside a graph neural network model that predicts molecular properties. The Variational Autoencoder (VAE), coupled with reinforcement learning, refines molecule generation toward targeted attributes. Data from an experimental study involving surfactants was used to test the framework. Saliency maps for the generated surfactants were produced to identify the features that explain the property values. The results showed that the proposed framework can effectively produce valid molecules within the set property threshold value. This approach not only streamlines molecular design for surfactant systems but also points to transformative advancements across different scientific and industrial landscapes.
Chemistry
What problem does this paper attempt to address?
The paper addresses the problem of how to use artificial intelligence (AI) to generate molecules with specific target properties. It proposes a three-part approach that combines generative AI, predictive models, and reinforcement learning to create customized molecules. Specifically, it builds a generative model from a variational autoencoder (VAE) operating on SELFIES (a string representation in which every string decodes to a valid molecule), and predicts molecular properties with a graph neural network (GNN). Reinforcement learning then steers molecule generation toward the target properties. The experiments show that the framework can generate valid molecules within the set property thresholds, particularly when designing surfactant molecules with a low critical micelle concentration (CMC). The approach simplifies the design of molecules with specific properties and may carry over to other scientific and industrial fields.
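The idea of steering generation toward a property threshold can be illustrated with a small reward function. The following is a minimal sketch under stated assumptions, not the authors' implementation: `threshold_reward`, `predict_property`, `invalid_penalty`, and the decaying reward shape are all hypothetical names and choices introduced here for illustration. In the paper's setting, the predictor role would be played by the GNN property model and the scored strings would come from the VAE.

```python
from typing import Callable, Optional


def threshold_reward(
    molecule: str,
    predict_property: Callable[[str], Optional[float]],
    threshold: float,
    invalid_penalty: float = -1.0,
) -> float:
    """Score a generated molecule against a property threshold.

    Hypothetical reward shape (an assumption, not taken from the paper):
    full reward when the predicted property is at or below the threshold,
    a reward that decays with the distance above it, and a fixed penalty
    when the predictor cannot score the molecule.
    """
    value = predict_property(molecule)
    if value is None:
        # The predictor returned no score, e.g. for an unusable string.
        return invalid_penalty
    if value <= threshold:
        # Inside the target region, e.g. a sufficiently low CMC.
        return 1.0
    # Outside the target region: reward shrinks as the gap grows.
    return 1.0 / (1.0 + (value - threshold))


# Toy stand-in for a trained property model (illustrative values only).
toy_scores = {"mol_a": 0.5, "mol_b": 2.0}
rewards = {
    name: threshold_reward(name, toy_scores.get, threshold=1.0)
    for name in ("mol_a", "mol_b", "mol_c")
}
```

Here `mol_a` falls inside the target region, `mol_b` lies above the threshold and earns a reduced reward, and `mol_c` is unknown to the toy predictor and receives the penalty. A reinforcement learning loop would use such scores to update the generator so that high-reward molecules become more likely.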