Abstract: We have developed an algorithm and implemented it in a software platform for developing new small-molecule anti-tumor drugs. In this study, we focused on generating molecules for the treatment of lung cancer. First, we used deep learning (DL) to evaluate the genes associated with poor clinical outcomes in lung cancer patients. We used generative adversarial networks (GANs) to augment the patient data. The result of each experiment was a list of genes ranked by their impact on the predicted outcome. We then intersected the gene lists obtained from the experiments on overall survival (OS) and progression-free interval (PFI) data, which identified a set of genes whose expression correlated with poor prognosis. To improve precision, we trained another DL model to distinguish normal from tumor tissue based on gene expression, which narrowed the result to a smaller set of genes that could be targeted. Subsequently, we developed a module that predicts interactions between inhibitors and proteins. This involved representing protein amino acid sequences and chemical compound formulas in vector form, followed by a virtual screening of the PubChem database. The drug-protein interaction module was built on a dataset of 118,379 pairs: 19,250 pairs describing compounds bound to proteins and 99,129 pairs describing non-bound ones. A DL model trained on this dataset achieved a ROC-AUC of 0.86. The search for candidate molecules yielded 160,000 pairs with a predicted interaction probability above 0.99, as well as 2,921 pairs with a probability of 1.0. Additionally, we created a DL-based module to predict IC50 values in cell line experiments.
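The gene-list intersection step can be illustrated with a minimal sketch. The abstract does not describe the implementation, so everything here is an assumption: the function name `intersect_ranked`, the mean-rank ordering of shared genes, and the placeholder gene identifiers are hypothetical and serve only to show the idea of keeping genes that appear in both the OS and PFI rankings and then narrowing them with the tumor-versus-normal gene set.

```python
def intersect_ranked(*ranked_lists):
    """Return genes present in every ranked list, ordered by mean rank.

    Ordering by mean rank is an illustrative assumption, not the
    authors' stated method.
    """
    shared = set(ranked_lists[0]).intersection(*ranked_lists[1:])

    def mean_rank(gene):
        return sum(lst.index(gene) for lst in ranked_lists) / len(ranked_lists)

    # Ties on mean rank are broken alphabetically for determinism.
    return sorted(shared, key=lambda g: (mean_rank(g), g))


# Hypothetical placeholder identifiers, not genes reported in the study.
os_ranked = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]    # ranked by impact on OS
pfi_ranked = ["GENE_C", "GENE_A", "GENE_E", "GENE_B"]   # ranked by impact on PFI
tumor_vs_normal = {"GENE_B", "GENE_C", "GENE_F"}        # from the second DL model

# Genes associated with poor prognosis in both experiments.
prognostic = intersect_ranked(os_ranked, pfi_ranked)

# Smaller set that also separates tumor from normal tissue.
targets = [g for g in prognostic if g in tumor_vs_normal]

print(prognostic)
print(targets)
```

Intersecting first and filtering second mirrors the order of steps described in the abstract; any real pipeline would also need a cutoff on how many top-ranked genes from each list enter the intersection.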
Virtual pre-clinical trials were conducted with the selected inhibitors to identify relevant cell lines for subsequent laboratory experiments. Through this process, we obtained formulas for several molecules with predicted binding to specific proteins. During the cell experiment emulation, our feature importance algorithm selected 129 genes. For this stage, we used only interactions with a predicted probability of at least 0.9. We prioritized molecules predicted to act on the fewest cell lines with high probability, thus favoring specificity. Ultimately, we selected 5 small molecules as potential candidates, along with specific cell lines for their validation. The NLP technologies used in this study proved effective in processing tens of thousands of articles. The pipeline of methods presented in this paper lays the groundwork for automated AI-driven drug discovery. We have demonstrated the application of modern machine learning methods, particularly DL, as well as the methods used to prepare the initial data for the learning algorithms. The performance of these methods was assessed by cross-validation on data from publicly available sources.
Citation Format: Dmitrii K Chebanov, Vsevolod A Misyurin, Nadezhda S Tatevosova. Deep learning-driven drug discovery: A breakthrough algorithm and its implication in lung cancer therapy development [abstract]. In: Proceedings of the AACR-NCI-EORTC Virtual International Conference on Molecular Targets and Cancer Therapeutics; 2023 Oct 11-15; Boston, MA. Philadelphia (PA): AACR; Mol Cancer Ther 2023;22(12 Suppl):Abstract nr A014.
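The prioritization rule described in the abstract (keep high-probability predictions, then prefer molecules that act on the fewest cell lines) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the abstract applies the 0.9 cutoff to predicted interactions, whereas here, for simplicity, the same cutoff is applied to a hypothetical per-cell-line activity probability. The function name `rank_by_selectivity`, the record layout, and all compound and cell line names are invented for the example.

```python
from collections import defaultdict


def rank_by_selectivity(predictions, threshold=0.9):
    """Rank compounds by how few cell lines they are predicted to act on.

    predictions: iterable of (compound, cell_line, probability) records.
    Compounds with no prediction at or above the threshold are dropped.
    Ties on cell-line count are broken by higher mean probability.
    """
    hits = defaultdict(list)
    for compound, cell_line, prob in predictions:
        if prob >= threshold:
            hits[compound].append((cell_line, prob))

    def sort_key(item):
        compound, pairs = item
        mean_prob = sum(p for _, p in pairs) / len(pairs)
        return (len(pairs), -mean_prob, compound)

    return [
        (compound, [cell_line for cell_line, _ in pairs])
        for compound, pairs in sorted(hits.items(), key=sort_key)
    ]


# Hypothetical records: (compound, cell line, predicted probability).
predictions = [
    ("cmpd_1", "line_X", 0.95), ("cmpd_1", "line_Y", 0.92), ("cmpd_1", "line_Z", 0.91),
    ("cmpd_2", "line_X", 0.97), ("cmpd_2", "line_Y", 0.40),
    ("cmpd_3", "line_Z", 0.93),
    ("cmpd_4", "line_X", 0.85),
]

ranked = rank_by_selectivity(predictions)
for compound, cell_lines in ranked:
    print(compound, cell_lines)
```

In this toy data, `cmpd_4` is dropped because no prediction reaches the cutoff, and `cmpd_1` ranks last because it passes the cutoff in three cell lines. The cell lines returned for each top-ranked compound are the natural candidates for laboratory validation.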
