Predicting protein retention in ion‐exchange chromatography using an open source QSPR workflow

Tim Neijenhuis, Olivier Le Bussy, Geoffroy Geldhof, Marieke E. Klijn, Marcel Ottens
DOI: https://doi.org/10.1002/biot.202300708
2024-03-15
Biotechnology Journal
Graphical and Lay Summary: Protein retention during a chromatographic step is affected by the physicochemical properties of both protein and resin. In this study, features are calculated from 3D protein structures using ProDes. For the prediction of protein retention in cation‐ and anion‐exchange chromatography, quantitative structure property models were trained with accuracies of R² > 0.87. Such models increase understanding, thereby allowing for rational process design.
Abstract: Protein‐based biopharmaceuticals require high purity before final formulation to ensure product safety, which makes process development time consuming. Implementing computational approaches at the initial stages of process development can significantly reduce development effort: by preselecting process conditions, experimental screening can be limited to a subset of conditions. One such computational selection approach is the application of Quantitative Structure Property Relationship (QSPR) models that describe the properties exploited during purification. This work presents a novel open‐source Python tool that extracts a range of features from protein 3D models on a local computer, allowing total transparency of the calculations. As an open‐source tool, it also lowers the initial investment third parties need to construct a QSPR workflow for protein property prediction, making it widely applicable within the field of bioprocess development. The molecular features currently calculated focus on properties projected onto the protein surface by constructing surface grid representations. Linear regression models were trained with the calculated features to predict chromatographic retention times/volumes. Model validation shows high accuracy for anion‐ and cation‐exchange chromatography data (cross‐validated R² of 0.87 and 0.95, respectively). Hence, these models demonstrate the potential of QSPR to accelerate process design.
Keywords: biotechnology & applied microbiology, biochemical research methods
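
The abstract describes training linear regression models on calculated protein features and reporting a cross‐validated R². A minimal sketch of that kind of evaluation is given below. It is not the authors' workflow and does not use ProDes: the feature matrix is synthetic random data standing in for surface descriptors, the retention values are simulated, and leave‐one‐out is assumed as the cross‐validation scheme because the abstract does not state which one was used. The function names (`solve`, `fit_ols`, `predict`, `loo_cv_r2`) are hypothetical helpers written for this illustration.

```python
import random


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_ols(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    A = [[1.0] + list(row) for row in X]
    p = len(A[0])
    AtA = [[sum(a[i] * a[j] for a in A) for j in range(p)] for i in range(p)]
    Aty = [sum(a[i] * yi for a, yi in zip(A, y)) for i in range(p)]
    return solve(AtA, Aty)


def predict(coef, row):
    """Predict one response from [intercept, w1, w2, ...] and a feature row."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def loo_cv_r2(X, y):
    """Leave-one-out cross-validated R^2: each sample is predicted by a model
    fitted on all other samples (assumed scheme, not taken from the paper)."""
    preds = []
    for i in range(len(y)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        preds.append(predict(fit_ols(X_train, y_train), X[i]))
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data: 20 "proteins", 3 made-up descriptors each.
random.seed(0)
true_w = [2.0, -1.0, 0.5]  # arbitrary weights, for illustration only
X = [[random.gauss(0, 1) for _ in true_w] for _ in range(20)]
y = [10.0 + sum(w * v for w, v in zip(true_w, row)) + random.gauss(0, 0.3)
     for row in X]

q2 = loo_cv_r2(X, y)
print(f"LOO cross-validated R^2: {q2:.2f}")
```

Because every prediction comes from a model that never saw the held‐out sample, the cross‐validated R² is a more conservative estimate of predictive accuracy than the R² of a fit on all data, which matters for the small datasets typical of chromatographic retention studies.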