Discovery of VEGFR-2 Inhibitors employing Junction Tree Variational Encoder with Local Latent Space Bayesian Optimization and Gradient Ascent Exploration

Tuyen Ngoc Truong, Thanh-An Pham, Van-Thinh To, Hoang-Son Lai Le, Phuoc-Chung Van Nguyen, The-Chuong Trinh, Tieu-Long Phan, Gia-Bao Truong
DOI: https://doi.org/10.26434/chemrxiv-2024-18mqh
2024-07-15
Abstract: Vascular endothelial growth factor receptor 2 (VEGFR-2), which belongs to the protein tyrosine kinase family, emerges as one of the most significant targets of interest. The ongoing Food and Drug Administration (FDA) approval of novel therapeutic medicines towards VEGFR-2 emphasizes the urgent need to discover sophisticated molecular structures that are capable of reliably limiting VEGFR-2 activity. Recognizing the huge potential of deep learning-based molecular model advancements, we focused our study on exploring the chemical space to find small molecules potentially inhibiting VEGFR-2. To achieve this goal, we utilized the Junction Tree Variational Autoencoder in combination with two optimization approaches on the latent space: local Bayesian optimization on the initial dataset and gradient ascent on nine FDA-approved drugs targeting VEGFR-2. The optimization yielded a set of 493 uncharted small molecules. Quantitative structure-activity relationship (QSAR) models and molecular docking were used to assess the inhibitory potential of the generated molecules through their predicted pIC50 and binding affinity. The QSAR model constructed on RDK7 fingerprints using the CatBoost algorithm achieved remarkable coefficients of determination (R²) of 0.792 ± 0.075 and 0.859 in internal and external validation, respectively. Molecular docking was implemented using the 4ASD complex with optimistic retrospective control results (the ROC-AUC value being 0.710 and the binding activity threshold being -7.90 kcal/mol). Newly generated molecules with acceptable results in both assessments were shortlisted and checked for interactions with the protein at the binding site on important residues, including Cys919, Asp1046, and Glu885.
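The retrospective docking control reported in the abstract (a ROC-AUC value and a binding activity threshold) can be illustrated with a minimal sketch. It assumes that more negative docking scores indicate stronger predicted binding and that a molecule passes when its score is at or below the threshold. The function names and the toy scores are hypothetical and are not taken from the paper; only the -7.90 kcal/mol threshold comes from the abstract.

```python
# Toy illustration of a retrospective docking control.
# Assumption: lower (more negative) docking score = better predicted binder.

THRESHOLD_KCAL_MOL = -7.90  # binding activity threshold quoted in the abstract


def roc_auc(active_scores, decoy_scores):
    """Probability that a random active scores better (lower) than a random decoy.

    This pairwise count is equivalent to the area under the ROC curve;
    ties contribute half a win.
    """
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


def passes_docking(score, threshold=THRESHOLD_KCAL_MOL):
    """Return True when the docking score is at or below the threshold."""
    return score <= threshold


# Hypothetical docking scores in kcal/mol (not data from the paper).
actives = [-9.1, -8.4, -7.2, -8.0]
decoys = [-7.5, -6.3, -8.2, -5.9]

auc = roc_auc(actives, decoys)
shortlisted = [s for s in actives + decoys if passes_docking(s)]
print(f"ROC-AUC on toy data: {auc:.4f}")
print(f"Scores passing the threshold: {shortlisted}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of actives from decoys, which gives context for the 0.710 reported for the 4ASD complex.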
Chemistry
What problem does this paper attempt to address?
This paper addresses the discovery of inhibitors of vascular endothelial growth factor receptor 2 (VEGFR-2). As a member of the protein tyrosine kinase family, VEGFR-2 is an important target in anti-cancer drug development. The researchers used a deep generative model, the Junction Tree Variational Autoencoder (JTVAE), to search its latent space for small molecules that could inhibit VEGFR-2. After training the JTVAE, they explored the latent space with two optimization strategies: local Bayesian optimization on the initial dataset, and gradient ascent starting from nine FDA-approved drugs targeting VEGFR-2. Together these produced 493 novel small molecules. The generated molecules were then assessed with quantitative structure-activity relationship (QSAR) models and molecular docking, which predict pIC50 values and binding affinity, respectively. The QSAR model, built on RDK7 fingerprints with the CatBoost algorithm, reached coefficients of determination (R²) of 0.792 ± 0.075 in internal validation and 0.859 in external validation. Molecular docking used the 4ASD complex, whose retrospective control gave a ROC-AUC of 0.710 and a binding activity threshold of -7.90 kcal/mol. Molecules with acceptable results in both assessments were shortlisted and checked for interactions with key binding-site residues, including Cys919, Asp1046, and Glu885. In summary, the paper combines deep generative modeling with latent space optimization to explore chemical space for new VEGFR-2 inhibitors, with the aim of providing more effective drug candidates for cancer treatment.
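The gradient ascent strategy described above can be sketched on a toy problem. This is a minimal illustration, not the authors' implementation: the latent vector, the quadratic surrogate score, and every name below are hypothetical. In the actual workflow the starting point would be the JTVAE encoding of an approved drug, the score would come from a learned property predictor, and each step would be decoded back to a molecule, none of which is reproduced here.

```python
# Toy gradient ascent in a latent space.
# Assumption: a differentiable surrogate score over latent vectors is available.
# Here it is a simple quadratic with a known optimum, purely for illustration.


def surrogate_score(z, optimum):
    """Toy property score: highest (zero) when z equals the optimum."""
    return -sum((zi - oi) ** 2 for zi, oi in zip(z, optimum))


def surrogate_grad(z, optimum):
    """Analytic gradient of the toy score with respect to z."""
    return [-2.0 * (zi - oi) for zi, oi in zip(z, optimum)]


def gradient_ascent(z0, optimum, learning_rate=0.1, steps=50):
    """Move the latent vector uphill and return every visited point."""
    z = list(z0)
    path = [list(z)]
    for _ in range(steps):
        grad = surrogate_grad(z, optimum)
        z = [zi + learning_rate * gi for zi, gi in zip(z, grad)]
        path.append(list(z))
    return path


seed_latent = [0.0, 0.0, 0.0]   # stands in for an encoded seed molecule
toy_optimum = [1.0, -2.0, 0.5]  # hypothetical best point of the toy score

trajectory = gradient_ascent(seed_latent, toy_optimum)
start_score = surrogate_score(trajectory[0], toy_optimum)
final_score = surrogate_score(trajectory[-1], toy_optimum)
print(f"score at seed: {start_score:.4f}, score after ascent: {final_score:.6f}")
```

Keeping the whole trajectory rather than only the end point mirrors the idea of exploration: in a generative setting, intermediate latent points along the path could each be decoded into candidate molecules.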