Human-level molecular optimization driven by mol-gene evolution

Jiebin Fang, Churu Mao, Yuchen Zhu, Xiaoming Chen, Chang-Yu Hsieh, Zhongjun Ma
2024-06-13
Abstract: De novo molecule generation allows the search for more drug-like hits across a vast chemical space. However, lead optimization is still required, and the process of optimizing molecular structures faces the challenge of balancing structural novelty with pharmacological properties. This study introduces the Deep Genetic Molecular Modification Algorithm (DGMM), which brings structure modification to the level of medicinal chemists. A discrete variational autoencoder (D-VAE) is used in DGMM to encode molecules as quantization code, mol-gene, which incorporates deep learning into genetic algorithms for flexible structural optimization. The mol-gene allows for the discovery of pharmacologically similar but structurally distinct compounds, and reveals the trade-offs of structural optimization in drug discovery. We demonstrate the effectiveness of the DGMM in several applications.
Machine Learning, Artificial Intelligence, Neural and Evolutionary Computing, Chemical Physics, Biomolecules
What problem does this paper attempt to address?
This paper addresses structure optimization in drug molecule design, in particular how to balance structural novelty against pharmacological properties during new drug development. Traditional drug discovery proceeds through multiple stages, from the initial screening of active compounds to structural modification that yields lead compounds with favorable pharmacological properties. This process is complex and time-consuming and demands substantial human and material resources. The structural modification stage in particular depends on medicinal chemistry experts who design and select molecular structures based on experience.
To meet these challenges, the paper proposes the Deep Genetic Molecular Modification Algorithm (DGMM), which combines deep learning with genetic algorithms to optimize molecular structures flexibly. Specifically, DGMM uses a discrete variational autoencoder (D-VAE) to encode molecules into quantized codes, called "mol-genes", so that deep learning techniques can be incorporated into genetic algorithms for structural optimization. The method can discover pharmacologically similar but structurally different compounds, and it also reveals the trade-offs involved in structure optimization during drug discovery. With the mol-gene representation, DGMM can explore a wider chemical space while maintaining the pharmacological properties of molecules, and thereby discover new candidate drug molecules. The study also demonstrates DGMM in practical drug discovery settings and verifies the effectiveness of the generated molecules.
In short, DGMM offers a strategy for accelerating the optimization of drug molecules, which could improve the efficiency of drug research and development.
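To make the idea of running a genetic algorithm over quantized codes more concrete, the following is a minimal sketch, not the DGMM implementation. It assumes each molecule has already been encoded as a fixed-length list of integer codes (a stand-in for a mol-gene). The codebook size, gene length, operator choices (single-point crossover, per-position mutation, elitist selection), and the toy fitness function are all hypothetical; a real pipeline would score candidates by decoding them and evaluating pharmacological properties.

```python
import random

CODEBOOK_SIZE = 16  # hypothetical number of discrete codes in the codebook
GENE_LENGTH = 8     # hypothetical length of one code sequence


def crossover(parent_a, parent_b, rng):
    """Single-point crossover of two equal-length code sequences."""
    point = rng.randrange(1, len(parent_a))  # cut point in 1..len-1
    return parent_a[:point] + parent_b[point:]


def mutate(gene, rng, rate=0.1):
    """Resample each position from the codebook with probability `rate`."""
    return [
        rng.randrange(CODEBOOK_SIZE) if rng.random() < rate else code
        for code in gene
    ]


def toy_fitness(gene, target):
    """Placeholder score: number of positions that match a target sequence."""
    return sum(a == b for a, b in zip(gene, target))


def evolve(population, fitness, rng, generations=50, elite=4):
    """Elitist loop: keep the top `elite` genes, refill with mutated children."""
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:elite]
        children = []
        while len(children) < len(population) - elite:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    return max(population, key=fitness)


def random_population(size, rng):
    """Random starting population of code sequences."""
    return [
        [rng.randrange(CODEBOOK_SIZE) for _ in range(GENE_LENGTH)]
        for _ in range(size)
    ]
```

A typical call would build a population with `random_population(20, random.Random(0))` and pass it to `evolve` together with a scoring function such as `lambda g: toy_fitness(g, target)`. Because the best sequences are carried over unchanged in every generation, the best score in the population never decreases.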