Top-down design of protein nanomaterials with reinforcement learning
Lutz, I. D., Wang, S., Norn, C., Borst, A. J., Zhao, Y. T., Dosey, A., Cao, L., Li, Z., Baek, M., King, N. P., Ruohola-Baker, H., Baker, D.
DOI: https://doi.org/10.1101/2022.09.25.509419
2022-09-26
bioRxiv
Abstract: The multisubunit protein assemblies that play critical roles in biology are the result of evolutionary selection for function of the entire assembly, and hence the subunits in structures such as icosahedral viral capsids often fit together with remarkable shape complementarity [1,2]. In contrast, the large multisubunit assemblies that have been created by de novo protein design, notably the icosahedral nanocages used in a new generation of potent vaccines [3-7], have been built by first designing oligomers with cyclic symmetry and then assembling these into nanocages while keeping the internal structure fixed [8-14], which results in more porous structures with less extensive shape matching between the components. Such hierarchical "bottom-up" design approaches have the advantage that one interface can be designed and validated in the context of the cyclic oligomer building block [15,16], but the disadvantage that the structural and functional features of the assemblies are limited by the properties of the predesigned building blocks. To overcome this limitation, we set out to develop a "top-down" reinforcement-learning-based approach to protein nanomaterial design in which both the structures of the subunits and the interactions between them are built up coordinately in the context of the entire assembly. We developed a Monte Carlo tree search (MCTS) method [17,18] that assembles protein monomer structures in the context of an overall architecture, guided by a loss function that enables specification of any desired overall structural properties such as shape and porosity. We demonstrate the power of the approach by designing hyperstable icosahedral assemblies that are more compact than any previously observed protein icosahedral structure (designed or naturally occurring), have very low porosity, and are robust to fusion and display of proteins as complex as influenza hemagglutinin. CryoEM structures of two designs are very close to the computational design models. Our top-down reinforcement learning approach should enable the design of a wide variety of complex protein nanomaterials by direct optimization of overall system properties.
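The abstract's core idea, a tree search that grows a structure piece by piece while a loss function scores the overall result, can be illustrated with a toy sketch. The code below is not the authors' implementation: it assumes a hypothetical 2D lattice chain in place of protein fragments, a hypothetical loss (squared radius of gyration plus a self-overlap penalty) as a stand-in for compactness, and a generic UCT-style MCTS. All names (`MOVES`, `loss`, `Node`, `mcts`) are invented for illustration.

```python
import math
import random

# Hypothetical move set: each "fragment" extends the chain by one unit step in 2D.
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def chain_coords(moves):
    """Coordinates visited by a chain that starts at the origin."""
    x, y = 0, 0
    coords = [(x, y)]
    for dx, dy in moves:
        x, y = x + dx, y + dy
        coords.append((x, y))
    return coords


def loss(moves):
    """Toy loss (an assumption): squared radius of gyration plus a clash penalty."""
    coords = chain_coords(moves)
    n = len(coords)
    cx = sum(x for x, _ in coords) / n
    cy = sum(y for _, y in coords) / n
    rg2 = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in coords) / n
    clashes = n - len(set(coords))  # lattice sites visited more than once
    return rg2 + 10.0 * clashes


class Node:
    def __init__(self, moves, parent=None):
        self.moves = moves        # tuple of moves chosen so far
        self.parent = parent
        self.children = {}        # move -> Node
        self.visits = 0
        self.total_reward = 0.0

    def uct_child(self, c=1.4):
        """Return the child with the highest UCT score."""
        def score(child):
            exploit = child.total_reward / child.visits
            explore = c * math.sqrt(math.log(self.visits) / child.visits)
            return exploit + explore
        return max(self.children.values(), key=score)


def mcts(length, iterations, seed=0):
    """Search for a low-loss chain of `length` moves; returns (moves, loss)."""
    rng = random.Random(seed)
    root = Node(())
    best_moves, best_loss = None, float("inf")
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded and not terminal.
        while len(node.moves) < length and len(node.children) == len(MOVES):
            node = node.uct_child()
        # Expansion: add one untried move below a non-terminal node.
        if len(node.moves) < length:
            untried = [m for m in MOVES if m not in node.children]
            move = rng.choice(untried)
            child = Node(node.moves + (move,), parent=node)
            node.children[move] = child
            node = child
        # Rollout: complete the chain with random moves and score the whole chain.
        moves = list(node.moves)
        while len(moves) < length:
            moves.append(rng.choice(MOVES))
        value = loss(moves)
        if value < best_loss:
            best_moves, best_loss = tuple(moves), value
        # Backpropagation: lower loss maps to a higher reward in (0, 1].
        reward = 1.0 / (1.0 + value)
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    return best_moves, best_loss
```

Calling `mcts(length=8, iterations=500)` returns the most compact chain found and its loss. Because the loss is evaluated on the completed chain rather than on individual steps, changing the target property in this sketch only requires swapping the `loss` function, which mirrors the "direct optimization of overall system properties" framing of the abstract.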