Reduction of monoclonal antibody viscosity using interpretable machine learning

Emily K. Makowski, Hsin-Ting Chen, Tiexin Wang, Lina Wu, Jie Huang, Marissa Mock, Patrick Underhill, Emma Pelegri-O'Day, Erick Maglalang, Dwight Winters, Peter M. Tessier
Affiliations: (a) Department of Pharmaceutical Sciences, University of Michigan, Ann Arbor, MI, USA; (b) Biointerfaces Institute, University of Michigan, Ann Arbor, MI, USA; (c) Department of Chemical Engineering, University of Michigan, Ann Arbor, MI, USA; (d) Therapeutic Discovery, Research, Amgen Inc, Thousand Oaks, CA, USA; (e) Department of Chemical and Biological Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA; (f) Drug Product Technologies, Amgen Inc, Thousand Oaks, CA, USA; (g) Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI, USA
DOI: https://doi.org/10.1080/19420862.2024.2303781
2024-03-14
mAbs
Abstract: Early identification of antibody candidates with drug-like properties is essential for simplifying the development of safe and effective antibody therapeutics. For subcutaneous administration, it is important to identify candidates with low self-association to enable their formulation at high concentration while maintaining low viscosity, opalescence, and aggregation. Here, we report an interpretable machine learning model for predicting antibody (IgG1) variants with low viscosity using only the sequences of their variable (Fv) regions. Our model was trained on antibody viscosity data (>100 mg/mL mAb concentration) obtained at a common formulation pH (pH 5.2), and it identifies three key Fv features of antibodies linked to viscosity, namely their isoelectric points, hydrophobic patch sizes, and numbers of negatively charged patches. Of the three features, most predicted antibodies at risk for high viscosity, including antibodies with diverse antibody germlines in our study (79 mAbs) as well as clinical-stage IgG1s (94 mAbs), are those with low Fv isoelectric points (Fv pIs < 6.3). Our model identifies viscous antibodies with relatively high accuracy not only in our training and test sets, but also for previously reported data. Importantly, we show that the interpretable nature of the model enables the design of mutations that significantly reduce antibody viscosity, which we confirmed experimentally. We expect that this approach can be readily integrated into the drug development process to reduce the need for experimental viscosity screening and improve the identification of antibody candidates with drug-like properties.
medicine, research & experimental
What problem does this paper attempt to address?
The paper addresses the following problem: **how to identify, at an early stage, monoclonal antibodies with drug-like properties, especially candidates suitable for subcutaneous administration that maintain low viscosity, opalescence, and aggregation at high concentration**. Specifically, the authors developed an interpretable machine-learning model that predicts the viscosity of monoclonal antibodies (IgG1) using only their variable region (Fv) sequences, and they used it to design mutations that significantly reduce antibody viscosity.

### Problem Background

1. **The Need for High-Concentration Antibody Formulations**: Subcutaneous administration of therapeutic antibodies requires high-concentration antibody solutions. However, many antibodies exhibit high viscosity, opalescence, and aggregation at high concentrations, which affects drug stability and administration.
2. **Limitations of Existing Methods**:
   - **Insufficient Data**: High-concentration antibody viscosity data are scarce, which makes reliable model training and testing difficult.
   - **Complexity and Inaccessibility**: Some existing models require complex calculations or access authorization, which limits their wide use.
   - **Difficult to Interpret**: Most machine-learning models are "black boxes" whose predictions are hard to interpret and therefore hard to use for the rational design of antibodies.
   - **Insufficient Validation**: Most models have been evaluated only on predicting unseen antibodies and have not been validated by designing new mutations.

### Core Contributions of the Paper

1. **Large-Scale Data Set**: The study uses the largest set of high-concentration (>100 mg/mL) antibody viscosity measurements available to date, with 62 antibodies for model training and 17 antibodies for testing.
2. **Simple and Interpretable Model**: A decision-tree-based classification model predicts the viscosity level of an antibody, requiring only its Fv amino acid sequence and homology modeling. The model is based on three key Fv features:
   - **Isoelectric Point (pI)**: Negatively correlated with viscosity.
   - **Hydrophobic Patch Size**: Positively correlated with viscosity.
   - **Number of Negative-Charge Patches**: Negatively correlated with viscosity.
3. **Experimental Verification**: Experiments confirmed that new mutations designed with the model significantly reduce antibody viscosity.
4. **Potential for Clinical Application**: The model can be integrated into the drug development process, reducing the need for experimental viscosity screening and improving the efficiency of identifying antibody candidates with drug-like properties.

### Key Findings

- **Influence of Isoelectric Point**: Antibodies with Fv isoelectric points lower than 6.3 are more likely to exhibit high viscosity.
- **Influence of Hydrophobic Patch**: Larger hydrophobic patches lead to higher viscosity.
- **Influence of Negative-Charge Patch**: More negative-charge patches help to reduce viscosity.
- **Four Types of Antibody Behaviors**: Based on the Fv isoelectric point, hydrophobic patch size, and number of negative-charge patches, antibodies are divided into four types (Type I, II, III, IV), each with different viscosity behavior.

### Summary

This paper addresses the problem of predicting the viscosity of high-concentration monoclonal antibodies by developing an interpretable machine-learning model, and it provides new tools and methods for antibody engineering. The result is expected to accelerate the development of antibody drugs and improve the safety and effectiveness of subcutaneous administration.
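
To make the interpretable rule concrete, the Fv pI criterion reported above (Fv pI < 6.3 flags high-viscosity risk) can be sketched as a minimal Python screen. This is an illustration under stated assumptions, not the authors' implementation. The names `FvFeatures`, `classify`, and `deeper_rule` are hypothetical. The cutoffs for hydrophobic patch size and negative-charge patch count are not given in this summary, so they are left to a caller-supplied rule rather than guessed, and the feature values are assumed to have been computed elsewhere (for example, from a homology model).

```python
from dataclasses import dataclass
from typing import Callable, Optional

# From the abstract: antibodies with Fv pI < 6.3 are at risk for high viscosity.
FV_PI_CUTOFF = 6.3


@dataclass
class FvFeatures:
    """The three Fv features named in the summary (hypothetical container)."""
    fv_pi: float                   # Fv isoelectric point
    hydrophobic_patch_size: float  # units depend on the modeling tool (assumption)
    n_negative_patches: int        # number of negatively charged patches


def classify(
    features: FvFeatures,
    deeper_rule: Optional[Callable[[FvFeatures], str]] = None,
) -> str:
    """Return 'high', 'low', or 'undetermined' viscosity risk.

    Only the Fv pI rule is taken from the source. Any further splits on
    patch features must be supplied by the caller via `deeper_rule`,
    because their cutoffs are not stated in this summary.
    """
    if features.fv_pi < FV_PI_CUTOFF:
        return "high"
    if deeper_rule is not None:
        return deeper_rule(features)
    return "undetermined"


if __name__ == "__main__":
    # Illustrative feature values only; these are not measured data.
    candidate = FvFeatures(fv_pi=5.8, hydrophobic_patch_size=120.0, n_negative_patches=2)
    print(classify(candidate))
```

Returning `"undetermined"` rather than `"low"` when no deeper rule is supplied avoids implying that a high Fv pI alone guarantees low viscosity, since the summary ties the remaining antibody types to the two patch features.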