Evolutionary Algorithms Simulating Molecular Evolution: A New Field Proposal

James S. L. Browning Jr., Daniel R. Tauritz, John Beckmann
2024-06-11
Abstract: The genetic blueprint for the essential functions of life is encoded in DNA, which is translated into proteins -- the engines driving most of our metabolic processes. Recent advancements in genome sequencing have unveiled a vast diversity of protein families, but compared to the massive search space of all possible amino acid sequences, the set of known functional families is minimal. One could say nature has a limited protein "vocabulary." The major question for computational biologists, therefore, is whether this vocabulary can be expanded to include useful proteins that went extinct long ago, or maybe never evolved in the first place. We outline a computational approach to solving this problem. By merging evolutionary algorithms, machine learning (ML), and bioinformatics, we can facilitate the development of completely novel proteins which have never existed before. We envision this work forming a new sub-field of computational evolution we dub evolutionary algorithms simulating molecular evolution (EASME).
Neural and Evolutionary Computing, Artificial Intelligence
What problem does this paper attempt to address?
This paper asks how evolutionary algorithms can simulate molecular evolution in order to expand the set of known functional proteins and to explore proteins with new functions. Although nature contains a wide variety of proteins, that set is very small compared to the space of all possible amino acid sequences.

The authors propose a new framework, evolutionary algorithms simulating molecular evolution (EASME), that combines evolutionary algorithms, machine learning, and bioinformatics to search the protein design space efficiently. They also propose biotechnological synthesis and screening methods to verify the search results experimentally. If successful, the approach would have broad applications in biotechnology, such as optimizing key photosynthetic enzymes to improve global agricultural production.

The paper also discusses prior work related to EASME and the limitations of machine learning on complex biological problems such as protein folding. The authors argue that evolutionary algorithms are better suited to revealing underlying principles, and that they can produce solutions that are easy to interpret rather than merely imitating existing biological processes. They plan to use EASME to explore and create new functional proteins that can be validated experimentally and potentially applied in areas such as agriculture or drug development.
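To make the idea of an evolutionary search over amino acid sequences concrete, here is a minimal toy sketch in Python. It is not the authors' EASME method: the names `toy_fitness`, `mutate`, `crossover`, and `evolve` are hypothetical, and the fitness function is a placeholder that simply counts matches against an arbitrary target string. A realistic system would replace it with a bioinformatics or ML-based score and with experimental screening, as the summary describes.

```python
import random

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_fitness(seq, target):
    """Placeholder score (an assumption): positions matching a made-up target."""
    return sum(a == b for a, b in zip(seq, target))


def mutate(seq, rate, rng):
    """Resample each residue independently with probability `rate`."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else aa for aa in seq
    )


def crossover(a, b, rng):
    """Single-point crossover between two equal-length sequences."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def evolve(target, pop_size=50, generations=200, rate=0.05, seed=0):
    """Run a simple generational EA with truncation selection and elitism."""
    rng = random.Random(seed)
    length = len(target)
    population = [
        "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
        for _ in range(pop_size)
    ]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda s: toy_fitness(s, target), reverse=True)
        history.append(toy_fitness(population[0], target))
        parents = population[: pop_size // 2]  # keep the fitter half as parents
        children = [population[0]]  # elitism: carry the best sequence over unchanged
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            children.append(mutate(crossover(p1, p2, rng), rate, rng))
        population = children
    best = max(population, key=lambda s: toy_fitness(s, target))
    return best, history


if __name__ == "__main__":
    # The target below is an arbitrary illustrative string, not a real protein.
    best, history = evolve("MKTAYIAKQR", pop_size=30, generations=100)
    print(best, history[-1])
```

Because the best sequence is copied unchanged into each new generation, the best fitness never decreases from one generation to the next. Even for this 10-residue toy target the search space holds 20^10 (about 10^13) sequences, which illustrates why exhaustive enumeration is impractical and a guided search is attractive.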