Abstract: Drug‐like molecules are found in one of two ways: by expanding the set of molecules with known properties, selecting molecules from a database and performing calculations on them, or by exploring the drug‐like chemical space, generating molecules from scratch with generative models. This review article highlights modern machine learning based methods that aim to extend our sampling capability beyond conventional screening methods.

Drug design is the process of identifying and designing novel molecules that have desirable properties and bind well to a given target receptor. Typically, such molecules are identified by screening large chemical libraries for desirable physicochemical properties and binding strength with the target protein. This traditional approach has two severe limitations. First, exhaustively screening every molecule in known chemical libraries is computationally infeasible. Second, currently available molecular libraries cover only a minuscule part of the entire set of possible drug‐like molecular structures (the drug‐like chemical space). In this review, we discuss how the first limitation is addressed by modeling virtual screening as a search problem, and how these efforts use machine learning to reduce the number of computational experiments required to identify top candidates. We then discuss generative methods that attempt to approximate the entire drug‐like chemical space, which gives us a path to explore beyond the known part of that space. We place special emphasis on generative models that learn the marginal distributions conditioned on specific properties or receptor structures for efficient sampling of molecules. Through this review, we aim to highlight modern machine learning based methods that efficiently extend our sampling capability beyond conventional screening methods, which in turn would benefit drug design significantly. We therefore also encourage further development of methods that address these important aspects of drug design.

This article is categorized under: Data Science > Chemoinformatics; Data Science > Artificial Intelligence/Machine Learning; Data Science > Computer Algorithms and Programming
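The idea of treating virtual screening as a search problem, with a learned model deciding which molecules to evaluate next, can be illustrated with a small sketch. It is not taken from the review: the library is a set of random feature vectors, `expensive_score` is a hypothetical stand-in for a docking calculation, and the surrogate is a simple k-nearest-neighbour average with greedy selection. All function names and parameters here are assumptions chosen for illustration.

```python
import random


def make_library(n, dim, rng):
    """Hypothetical library: each 'molecule' is a random feature vector."""
    return [[rng.random() for _ in range(dim)] for _ in range(n)]


def expensive_score(features):
    """Stand-in for a costly docking run (assumption): lower is better."""
    return sum((x - 0.7) ** 2 for x in features)


def knn_predict(features, labeled, k=5):
    """Surrogate model: mean score of the k nearest already-scored molecules."""
    by_distance = sorted(
        (sum((a - b) ** 2 for a, b in zip(features, other)), score)
        for other, score in labeled
    )
    nearest = by_distance[:k]
    return sum(score for _, score in nearest) / len(nearest)


def active_screen(library, n_init, batch, rounds, rng):
    """Score a random seed set, then repeatedly score the molecules the
    surrogate predicts to be best (greedy acquisition)."""
    unscored = list(range(len(library)))
    scores = {}
    for i in rng.sample(unscored, n_init):
        scores[i] = expensive_score(library[i])
    for _ in range(rounds):
        unscored = [i for i in unscored if i not in scores]
        labeled = [(library[i], s) for i, s in scores.items()]
        ranked = sorted(unscored, key=lambda i: knn_predict(library[i], labeled))
        for i in ranked[:batch]:
            scores[i] = expensive_score(library[i])
    return scores


def random_screen(library, budget, rng):
    """Baseline: spend the same budget on uniformly random molecules."""
    picked = rng.sample(range(len(library)), budget)
    return {i: expensive_score(library[i]) for i in picked}


def recall_of_top(scores, library, top_n):
    """Fraction of the true top_n molecules that were actually evaluated.
    Uses exhaustive scoring, which is only affordable in a toy setting."""
    order = sorted(range(len(library)), key=lambda i: expensive_score(library[i]))
    true_top = set(order[:top_n])
    return len(true_top & set(scores)) / top_n


if __name__ == "__main__":
    rng = random.Random(0)
    library = make_library(1000, 4, rng)
    active = active_screen(library, n_init=50, batch=50, rounds=4, rng=rng)
    baseline = random_screen(library, budget=len(active), rng=rng)
    print("evaluated:", len(active), "of", len(library))
    print("top-50 recall, surrogate-guided:", recall_of_top(active, library, 50))
    print("top-50 recall, random:", recall_of_top(baseline, library, 50))
```

Both strategies spend the same number of expensive evaluations, so the two recall values show how much the surrogate helps on this toy problem. Real workflows would replace the feature vectors with molecular fingerprints or learned embeddings, the scoring function with docking or free-energy calculations, and the surrogate and acquisition rule with whatever suits the library size.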

Enhanced Sampling of Chemical Space for High Throughput Screening Applications using Machine Learning

Streamlining Computational Fragment-Based Drug Discovery Through Evolutionary Optimization Informed by Ligand-Based Virtual Prescreening

Efficient and enhanced sampling of drug‐like chemical space for virtual screening and molecular design using modern machine learning methods

Efficient Exploration of Chemical Space with Docking and Deep Learning

Thompson Sampling─An Efficient Method for Searching Ultralarge Synthesis on Demand Databases

A Mechanism to Open Academic Chemistry to High-Throughput Virtual Screening

Artificial intelligence-enabled virtual screening of ultra-large chemical libraries with deep docking

Controlled exploration of chemical space by machine learning of coarse-grained representations

Novel Big Data-Driven Machine Learning Models for Drug Discovery Application

Emerging structure-based computational methods to screen the exploding accessible chemical space

Scalable Partitioning and Exploration of Chemical Spaces Using Geometric Hashing

Synthon-based ligand discovery in virtual libraries of over 11 billion compounds

A deep-learning view of chemical space designed to facilitate drug discovery

Structure-Based Virtual Screening of Chemical Libraries for Drug Discovery

The Pan-Canadian Chemical Library: A Mechanism to Open Academic Chemistry to High-Throughput Virtual Screening

Augmenting Hit Identification by Virtual Screening Techniques in Small Molecule Drug Discovery

Data-driven approaches used for compound library design, hit triage and bioactivity modeling in high-throughput screening

Application of QSAR and shape pharmacophore modeling approaches for targeted chemical library design

Evaluating Scalable Supervised Learning for Synthesize-on-Demand Chemical Libraries

Virtual Screening of Molecules via Neural Fingerprint-based Deep Learning Technique

Pareto Optimization to Accelerate Multi-Objective Virtual Screening