NC-VAE: normalised conditional diverse variational autoencoder guided de novo molecule generation

DOI: https://doi.org/10.1007/s11227-024-06250-2
IF: 3.3
2024-06-07
The Journal of Supercomputing
Abstract: This work proposes a novel approach for drug molecule design using data-assisted techniques. The approach leverages a generation-based framework to expedite the drug discovery process, aiming to identify candidate molecules suitable for production while minimizing development timelines and regulatory hurdles. The core of the proposed method is a conditional variational autoencoder (CVAE) for molecule generation, employing the NCSMILES string representation. The framework involves three key stages: (1) molecule generation using the CVAE, (2) filtering based on a scoring function, and (3) identification of the optimal molecule from the generated pool. To enhance the latent space representation, we incorporate molecule properties alongside conditional selection criteria. The performance of the proposed scheme is comprehensively evaluated on standard benchmark datasets using various metrics, including validity, diversity, usefulness, and novelty. The method demonstrates superior performance compared to existing state-of-the-art approaches, attributable to several key improvements, including intermediary optimizations and condition-based selection.
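The filter-and-select stages and the evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration, not the paper's implementation: the function names, the `is_valid` and `score` callables, the threshold, and the metric definitions (validity, uniqueness as a simple stand-in for diversity, and novelty relative to a training set) follow common practice in molecule-generation benchmarks. The CVAE itself is not modelled; the candidate strings stand in for its output.

```python
from typing import Callable, Dict, Iterable, List, Optional, Sequence, Tuple


def select_best_molecule(
    candidates: Iterable[str],
    is_valid: Callable[[str], bool],
    score: Callable[[str], float],
    threshold: float,
) -> Tuple[Optional[str], List[str]]:
    """Hypothetical stages 2 and 3: filter candidates, then pick the top one."""
    seen = set()
    kept: List[str] = []
    for s in candidates:
        # Skip duplicates and strings the validity check rejects.
        if s in seen or not is_valid(s):
            continue
        seen.add(s)
        # Stage 2: keep only candidates whose score clears the threshold.
        if score(s) >= threshold:
            kept.append(s)
    # Stage 3: highest-scoring survivor, or None if nothing passed the filter.
    best = max(kept, key=score) if kept else None
    return best, kept


def generation_metrics(
    generated: Sequence[str],
    training_set: Iterable[str],
    is_valid: Callable[[str], bool],
) -> Dict[str, float]:
    """Assumed definitions: fractions of valid, unique, and unseen strings."""
    valid = [s for s in generated if is_valid(s)]
    unique = set(valid)
    novel = unique - set(training_set)
    return {
        "validity": len(valid) / len(generated) if generated else 0.0,
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }


# Toy stand-ins (assumptions): a real pipeline would parse each string with a
# cheminformatics toolkit and score it with a property predictor.
def toy_is_valid(s: str) -> bool:
    return s.count("(") == s.count(")")


toy_score = len  # placeholder score: string length

pool = ["CCO", "CC(", "CCO", "CCCCO", "C"]  # stands in for stage-1 output
best, kept = select_best_molecule(pool, toy_is_valid, toy_score, threshold=3)
metrics = generation_metrics(pool, training_set=["CCO"], is_valid=toy_is_valid)
```

Passing the validity check and the scoring function as callables keeps the sketch independent of any particular toolkit or property model.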
computer science, theory & methods, engineering, electrical & electronic, hardware & architecture