Learning-Based Estimation of Fitness Landscape Ruggedness for Directed Evolution

Sebastian Towers, Jessica James, Harrison Steel, Idris Kempf
DOI: https://doi.org/10.1101/2024.02.28.582468
2024-03-02
Abstract: Directed evolution is a method for engineering biological systems or components, such as proteins, wherein desired traits are optimised through iterative rounds of mutagenesis and selection of fit variants. The process of protein directed evolution can be envisaged as navigation over high-dimensional landscapes with numerous local maxima, mapping every possible variant of a protein to its fitness. The performance of any strategy in navigating such a landscape is dependent on several parameters, including its ruggedness. However, this information is generally unavailable at the outset of an experiment, and cannot be computed using analytical methods. Here we propose a learning-based method for estimating landscape ruggedness from a mutating population, using only population average performance data. This method uses a short period of exploration at the beginning of an experiment to predict the ruggedness, subsequently guiding the choice of high-performing parameters for directed evolution control. We then simulate this approach on two real-world protein fitness landscapes, demonstrating an improvement upon the performance of standard strategies, particularly on rugged landscapes. In addition to improving the overall outcomes of directed evolution, this method has the advantage of being readily deployable in laboratory settings, even in configurations that exclusively capture average population measures. Given the rapidly expanding application space of engineered proteins, the products of improved directed evolution are relevant in medicine, agriculture and manufacturing.
Bioengineering