Abstract: Immune checkpoint inhibitors have attracted significant attention in oncology research in recent years. Numerous studies have shown that inhibitors targeting Programmed Death-Ligand 1 (PD-L1) play a pivotal role in countering the mechanisms by which cancer cells evade the immune system. This study aimed to develop an integrated screening model for PD-L1 inhibitors that combines an Artificial Neural Network (ANN), Molecular Similarity (MS) assessments, and molecular docking with GNINA 1.0. A database of 2044 substances with known PD-L1 inhibitory activity was compiled from Google Patents and used to support the molecular similarity evaluations and to train the machine learning model. For retrospective validation of the docking procedure, the human PD-L1 protein (Protein Data Bank (PDB) ID: 5N2F) was employed as a control. In the screening phase, 15,235 compounds from the DrugBank database were passed through a series of filters: first medicinal chemistry filters, then MS assessments, then the ANN model, and finally molecular docking with GNINA 1.0. Decoy generation yielded promising results, with a 1-nearest-neighbor (1NN) AUC-ROC of 0.52 and Doppelganger scores with a mean of 0.24 and a maximum of 0.346, indicating a high resemblance of the decoys to the active set. For MS, Avalon emerged as the most effective fingerprint for similarity searching, with an Enrichment Factor at 1% (EF 1%) of 10.96%, an AUC-ROC of 0.963, and an optimal similarity threshold of 0.32. The ANN model performed best in cross-validation, achieving an average precision of 0.863±0.032 and an F1 score of 0.745±0.039 and outperforming both the Support Vector Classifier (SVC) and Random Forest (RF) models, although not significantly. In external validation, the ANN model remained the best performer, with an average precision of 0.851 and an F1 score of 0.790. GNINA 1.0, used for molecular docking, was validated through redocking and retrospective control, achieving an AUC of 0.975 with a critical cnn_pose_score threshold of 0.73. Of the initial 15,235 compounds, 128 were shortlisted using the MS and ANN models. Further screening with GNINA 1.0 identified 22 potential candidates, among which (3S)-1-(4-acetylphenyl)-5-oxopyrrolidine-3-carboxylic acid emerged as the most promising, with a cnn_pose_score of 0.79, a PD-L1 inhibitory probability of 70.5%, and a Tanimoto coefficient of 0.35.
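The screening cascade described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Fingerprints are represented as plain sets of on-bit indices rather than real Avalon fingerprints, and the `Compound` fields, the function names, the ANN probability cutoff of 0.5, and the rule that a compound must pass every stage are all assumptions made for illustration. Only the similarity threshold (0.32) and the cnn_pose_score threshold (0.73) come from the abstract.

```python
from dataclasses import dataclass


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def enrichment_factor(scores, labels, fraction=0.01):
    """Hit rate among the top-scoring fraction divided by the overall hit rate."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(round(len(ranked) * fraction)))
    total_hits = sum(labels)
    if total_hits == 0:
        return 0.0
    hits_top = sum(label for _, label in ranked[:n_top])
    return (hits_top / n_top) / (total_hits / len(ranked))


@dataclass
class Compound:
    # Hypothetical record; the field names are illustrative only.
    name: str
    max_similarity: float   # highest Tanimoto coefficient against the active set
    ann_probability: float  # predicted probability of PD-L1 inhibition
    cnn_pose_score: float   # GNINA CNN pose score of the docked pose


SIM_THRESHOLD = 0.32   # optimal similarity threshold reported in the abstract
POSE_THRESHOLD = 0.73  # critical cnn_pose_score reported in the abstract
ANN_THRESHOLD = 0.5    # assumption: the abstract does not state the cutoff


def passes_cascade(c: Compound) -> bool:
    """Assumed rule: a compound must pass similarity, ANN, and docking filters."""
    return (
        c.max_similarity >= SIM_THRESHOLD
        and c.ann_probability >= ANN_THRESHOLD
        and c.cnn_pose_score >= POSE_THRESHOLD
    )


# Values reported in the abstract for the top-ranked candidate.
top_hit = Compound(
    name="(3S)-1-(4-acetylphenyl)-5-oxopyrrolidine-3-carboxylic acid",
    max_similarity=0.35,
    ann_probability=0.705,
    cnn_pose_score=0.79,
)
print(passes_cascade(top_hit))
```

In practice the fingerprints would come from a cheminformatics toolkit and the pose scores from GNINA output; the sketch only shows how the reported thresholds could be chained into a single filter.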

Small dataset solves big problem: An outlier-insensitive binary classifier for inhibitory potency prediction

Transfer inhibitory potency prediction to binary classification: A model only needs a small training set

Prediction of Inhibitory Activity Against the MATE1 Transporter via Combined Fingerprint- and Physics-Based Machine Learning Models

SAE-SV: A Stacked-AutoEncoder and Soft Voting Joint Approach Based on Small Dataset with High Dimensions for Inhibitory Potency Prediction

Consensus models for CDK5 inhibitors in silico and their application to inhibitor discovery

An Interpretable Multitask Framework BiLAT Enables Accurate Prediction of Cyclin-Dependent Protein Kinase Inhibitors

Establishment of extensive artificial intelligence models for kinase inhibitor prediction: Identification of novel PDGFRB inhibitors

MinKLIFSAI: a simple machine learning approach toward selective kinase inhibitor

Machine Learning-Enabled Pipeline for Large-Scale Virtual Drug Screening

A Comparative Study of SMILES-Based Supervised Machine Learning Models

Docking-informed machine learning for kinome wide affinity prediction

Innovative Virtual Screening of PD-L1 Inhibitors: The Synergy of Molecular Similarity, Neural Networks, and GNINA Docking

A deep learning based multi-model approach for predicting drug-like chemical compound's toxicity

Prediction of Small Molecule Kinase Inhibitors for Chemotherapy Using Deep Learning

Semi-Supervised Learning to Boost Cardiotoxicity Prediction by Mining a Large Unlabeled Small Molecule Dataset

Novel Big Data-Driven Machine Learning Models for Drug Discovery Application

Testing the predictive power of reverse screening to infer drug targets, with the help of machine learning

An Innovative Multi-Omics Model Integrating Latent Alignment and Attention Mechanism for Drug Response Prediction

Analysis of protein features and machine learning algorithms for prediction of druggable proteins

Machine learning-based classification models for non-covalent Bruton's tyrosine kinase inhibitors: predictive ability and interpretability

Bridging the gap between target-based and cell-based drug discovery with a graph generative multi-task model