Crossover Operators for Molecular Graphs with an Application to Virtual Drug Screening

Nico Domschke, Bruno Schmidt, Thomas Gatter, Richard Golnik, Paul Eisenhuth, Fabian Liessmann, Jens Meiler, Peter F Stadler
DOI: https://doi.org/10.26434/chemrxiv-2024-41295
2024-09-17
Abstract: Genetic algorithms are a powerful method for solving optimization problems with complex cost functions over vast search spaces; they rely in particular on recombining parts of previous solutions. Crossover operators play a crucial role in this context. Here, we describe a large class of these operators designed for searching over spaces of graphs. These operators are based on introducing small cuts into graphs and rejoining the resulting induced subgraphs of two parents. This form of cut-and-join crossover can be restricted in a consistent way to preserve local properties such as vertex degrees (valency) or bond orders, as well as global properties such as graph-theoretic planarity. In contrast to crossover on strings, cut-and-join crossover on graphs is powerful enough to ergodically explore chemical space even in the absence of mutation operators. Extensive benchmarking shows that the offspring of molecular graphs are again plausible molecules with high probability, while at the same time crossover drastically increases the diversity compared to initial molecule libraries. Moreover, desirable properties such as favorable indices of synthesizability are preserved with sufficient frequency that candidate offspring can be filtered efficiently for such properties. As an application, we used the cut-and-join crossover in REvoLd, a GA-based system for computer-aided drug design (CADD). In optimization runs searching for ligands binding to four different target proteins, we consistently found candidate molecules with binding constants exceeding the best known binders as well as candidates found in make-on-demand libraries. Taken together, cut-and-join crossover operators constitute a mathematically simple and well-characterized approach to recombination of molecules that performed very well in real-life CADD tasks.
Chemistry
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

The main goal of this paper is to define and characterize a class of crossover operators for graphs, targeting molecular graphs in the exploration of chemical space. The authors propose a cut-and-join method: small cuts are introduced into two parent graphs, and the resulting induced subgraphs are rejoined to form new molecular graphs.

#### Main Content and Contributions:

1. **Design of Crossover Operators**: The paper describes a large class of crossover operators for graphs. Each operator introduces small cuts into the parent graphs and rejoins the induced subgraphs of the two parents.
2. **Preservation of Local and Global Properties**: The authors show how the cut-and-join method can be restricted in a consistent way to preserve local properties (such as vertex degree, i.e. valency, and bond order) and global properties (such as graph-theoretic planarity). Preserving these properties is crucial for chemical applications.
3. **Exploration of Chemical Space**: In contrast to crossover on strings, cut-and-join crossover on graphs can ergodically explore chemical space even in the absence of mutation operators.
4. **Application in Virtual Drug Screening**: The authors apply the cut-and-join crossover operators in REvoLd, a GA-based system for computer-aided drug design. In optimization runs against four different target proteins, the method consistently found candidate molecules with binding constants exceeding those of the best known binders and of candidates found in make-on-demand libraries.
5. **Theoretical Analysis**: The paper also provides a theoretical analysis, proving that under certain conditions it is possible to construct crossover offspring that remain planar.

In summary, this paper aims to develop a mathematically simple and well-characterized method for molecular recombination that performs well in practical computer-aided drug design tasks.
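
The cut-and-join idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation or the REvoLd code. It assumes a simplified setting: hydrogens are implicit, a cut removes every bond that leaves a chosen vertex subset, and the two fragments are rejoined only if their cut bonds can be paired by equal bond order, so that the degree of each kept atom and all bond orders are preserved. The names `MolGraph`, `cut`, and `join` are hypothetical and chosen for illustration; planarity is not checked here.

```python
import random
from dataclasses import dataclass


@dataclass
class MolGraph:
    atoms: dict  # vertex id -> element symbol
    bonds: dict  # frozenset({u, v}) -> bond order

    def degree(self, v):
        """Sum of bond orders at vertex v (hydrogens are implicit)."""
        return sum(order for edge, order in self.bonds.items() if v in edge)


def cut(graph, keep):
    """Return the subgraph induced by `keep` and the dangling stubs.

    Each stub is a pair (vertex, bond_order) for a bond that was cut.
    """
    keep = set(keep)
    atoms = {v: graph.atoms[v] for v in keep}
    bonds, stubs = {}, []
    for edge, order in graph.bonds.items():
        inside = [v for v in edge if v in keep]
        if len(inside) == 2:
            bonds[edge] = order               # bond lies inside the fragment
        elif len(inside) == 1:
            stubs.append((inside[0], order))  # bond crosses the cut
    return MolGraph(atoms, bonds), stubs


def join(frag_a, stubs_a, frag_b, stubs_b, rng=random):
    """Join two fragments by pairing stubs of equal bond order.

    Returns None when no degree-preserving pairing exists.
    """
    if not stubs_a or sorted(o for _, o in stubs_a) != sorted(o for _, o in stubs_b):
        return None
    # Tag vertex ids with their parent so the two fragments stay disjoint.
    atoms = {("a", v): s for v, s in frag_a.atoms.items()}
    atoms.update({("b", v): s for v, s in frag_b.atoms.items()})
    bonds = {frozenset(("a", v) for v in e): o for e, o in frag_a.bonds.items()}
    bonds.update({frozenset(("b", v) for v in e): o for e, o in frag_b.bonds.items()})
    for order in sorted({o for _, o in stubs_a}):
        ends_a = [v for v, o in stubs_a if o == order]
        ends_b = [v for v, o in stubs_b if o == order]
        rng.shuffle(ends_b)  # random pairing among stubs of the same order
        for u, w in zip(ends_a, ends_b):
            edge = frozenset({("a", u), ("b", w)})
            if edge in bonds:
                return None  # reject pairings that would create a multi-edge
            bonds[edge] = order
    return MolGraph(atoms, bonds)


# Parents: ethanol (C-C-O) and methylamine (C-N), heavy atoms only.
ethanol = MolGraph({0: "C", 1: "C", 2: "O"},
                   {frozenset({0, 1}): 1, frozenset({1, 2}): 1})
methylamine = MolGraph({0: "C", 1: "N"}, {frozenset({0, 1}): 1})

frag_a, stubs_a = cut(ethanol, {0, 1})   # keep C-C, cut the C-O bond
frag_b, stubs_b = cut(methylamine, {1})  # keep N, cut the C-N bond
child = join(frag_a, stubs_a, frag_b, stubs_b)  # C-C-N (ethylamine skeleton)
```

In this toy example each parent contributes one cut bond of order 1, so the pairing is unique and the child keeps the heavy-atom degree of every retained vertex. A fragment whose only stub has order 2 would be rejected by `join` against these stubs, which is the kind of restriction the summary describes for preserving valency and bond order.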