How the technologies behind self‐driving cars, social networks, ChatGPT, and DALL‐E2 are changing structural biology
Matthias Bochtler
DOI: https://doi.org/10.1002/bies.202400155
2024-10-17
BioEssays
Abstract: Deep learning, a brain-inspired field of computer science, is revolutionizing structural biology. I summarize how convolutional neural networks (CNNs), large language models (LLMs), denoising diffusion probabilistic models (DDPMs)/noise conditional score networks (NCSNs), and graph neural networks (GNNs) have impacted protein structure prediction, inverse folding, protein design, and small molecule design. The performance of deep neural networks (NNs) in the text (ChatGPT) and image (DALL‐E2) domains has attracted worldwide attention. CNNs, LLMs, DDPMs/NCSNs, and GNNs have impacted computer vision, language editing and translation, automated conversation, image generation, and social network management. Proteins can be viewed as texts written with the alphabet of amino acids, as images, or as graphs of interacting residues. Each of these perspectives suggests the use of tools from a different area of deep learning for protein structural biology. Here, I review how CNNs, LLMs, DDPMs/NCSNs, and GNNs have led to major advances in protein structure prediction, inverse folding, protein design, and small molecule design. This review is primarily intended as a deep learning primer for practicing experimental structural biologists. However, extensive references to the deep learning literature should also make it relevant to readers who have a background in machine learning, physics, or statistics, and an interest in protein structural biology.
biochemistry & molecular biology, biology