Gradient Oriented Active Learning for Candidate Drug Design

Venkatesh Medabalimi
DOI: https://doi.org/10.1101/2024.07.11.603160
2024-07-15
Abstract: One of the primary challenges of drug design is that the complexity of biology often comes to the fore only when proposed candidates are eventually tested in reality. This necessitates making the discovery process more efficient by making it actively seek what it wants to know of reality. We propose Gradient Oriented Active Learning (GOAL), a technique for optimizing sequence design through active exploration of sequence space. GOAL interleaves performing experiments with learning models that propose experiments for the next iteration through gradient-based descent in the sequence space. We demonstrate the promise of this method using the challenge of mRNA design as our guiding example. Using computational methods as a surrogate for experimental data, we provide evidence that for certain objectives, even if one is restricted by bandwidth (the number of experiments that can be performed in parallel), increasing the number of iterations can still facilitate optimization using very few experiments in total. We show that the availability of high-throughput experiments can considerably bring down the number of iterations required. We further investigate the intricacies of performing multi-objective optimization using GOAL.
Synthetic Biology
What problem does this paper attempt to address?
The paper addresses a core challenge in drug design: the complexity of biology often becomes apparent only when proposed drug candidates are tested in practice. To make the drug discovery process more efficient, the author proposes a technique called Gradient Oriented Active Learning (GOAL). GOAL optimizes sequence design through active exploration of the sequence space, interleaving experiments with learned models that propose the next round of experiments by gradient-based descent. Specifically, the GOAL technique aims to address the following key issues:

1. **Improving the efficiency of drug discovery**: Active learning gathers information more effectively during the design process, which accelerates discovery.
2. **mRNA design challenges**: The paper uses mRNA design as a guiding example to demonstrate how the GOAL method can optimize mRNA sequences to meet various requirements, such as thermal stability, low toxicity, and high translation efficiency.
3. **Reducing experimental costs**: By iteratively combining computational predictions and experimental validation, GOAL can find near-optimal sequences with limited experimental resources.
4. **Multi-objective optimization**: The study explores how to balance multiple competing objectives (e.g., minimum free energy, number of unpaired bases) when searching for a good mRNA sequence.
5. **Sample complexity**: It discusses the sample-complexity advantages of GOAL over static methods that rely solely on random sampling.
6. **Role of high-throughput experiments**: It analyzes how high-throughput experiments can significantly reduce the number of iterations needed to reach an optimum.

In summary, the paper proposes a new method for drug design, with a particular focus on mRNA design, and demonstrates its effectiveness using computational methods as a surrogate for experimental data. By combining computational predictions and experimental validation, GOAL can explore the sequence space more efficiently, thereby accelerating the drug discovery process.
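The interleaving of experiments and model-guided proposals described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the oracle `toy_experiment` is a hypothetical stand-in for a real assay (it simply counts mismatches against a hidden target sequence), the surrogate model is a ridge-regularised linear model on one-hot features, and the proposal rule takes one gradient descent step in a relaxed one-hot space before projecting back to discrete sequences with a few random mutations for exploration. The names `goal_loop`, `propose`, and `fit_ridge` are likewise hypothetical.

```python
import numpy as np

ALPHABET = "ACGU"  # RNA nucleotides, integer-coded as 0..3


def one_hot(seqs):
    """Encode integer-coded sequences (n, L) as flat one-hot vectors (n, L*4)."""
    return np.eye(len(ALPHABET))[seqs].reshape(len(seqs), -1)


def toy_experiment(seqs, target):
    """Hypothetical assay (an assumption): cost = mismatches to a hidden target."""
    return (seqs != target).sum(axis=1).astype(float)


def fit_ridge(X, y, lam=1e-2):
    """Fit a linear surrogate of the centred cost with ridge regularisation."""
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ (y - y.mean()))


def propose(seeds, w, step, rng, n_mut=2):
    """One descent step in relaxed one-hot space, projected back to sequences."""
    n, L = seeds.shape
    grad = w.reshape(L, len(ALPHABET))            # gradient of the linear model
    relaxed = np.eye(len(ALPHABET))[seeds] - step * grad
    proposals = relaxed.argmax(axis=2)            # projection to discrete letters
    # A few random point mutations keep the next batch exploratory.
    rows = np.repeat(np.arange(n), n_mut)
    cols = rng.integers(0, L, size=n * n_mut)
    proposals[rows, cols] = rng.integers(0, len(ALPHABET), size=n * n_mut)
    return proposals


def goal_loop(L=20, batch=16, iters=8, step=2.0, seed=0):
    """Alternate 'experiments' and model fitting; return best cost per round."""
    rng = np.random.default_rng(seed)
    target = rng.integers(0, len(ALPHABET), size=L)
    seqs = rng.integers(0, len(ALPHABET), size=(batch, L))
    costs = toy_experiment(seqs, target)          # initial random batch
    history = [costs.min()]
    for _ in range(iters):
        w = fit_ridge(one_hot(seqs), costs)       # learn from all data so far
        seeds = seqs[np.argsort(costs)[:batch]]   # current lowest-cost sequences
        new = propose(seeds, w, step, rng)
        seqs = np.vstack([seqs, new])
        costs = np.concatenate([costs, toy_experiment(new, target)])
        history.append(costs.min())
    return history


if __name__ == "__main__":
    print(goal_loop())
```

In this sketch `batch` plays the role of experimental bandwidth and `iters` the number of rounds, so the total number of simulated experiments is `batch * (iters + 1)`. Varying the two against each other gives a rough feel for the trade-off the summary describes, although a toy additive cost says nothing about how a real objective such as minimum free energy would behave.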