POxload: Machine Learning Estimates Drug Loadings of Polymeric Micelles

Robert Luxenhofer, Josef Kehrein, Alex Bunker
DOI: https://doi.org/10.26434/chemrxiv-2024-l5kvc
2024-01-09
Abstract: Amphiphilic ABA-triblock copolymers, comprised of poly(2-oxazoline)s and poly(2-oxazine)s, can serve as drug delivery systems; they form micelles that carry poorly water-soluble drugs. Many recent studies have investigated the effect of structural changes of the polymer and the hydrophobic cargo on drug loading. In this work, we combine these data to establish an extended formulation database. Different molecular properties and fingerprints are tested for their applicability to serve as formulation-specific mixture descriptors. A variety of classification and regression models is built for different descriptor subsets and thresholds of loading efficiency and loading capacity, with the best models achieving overall good statistics for both cross- and external validation (balanced accuracies of 0.8). Subsequently, important features are dissected for interpretation and the DrugBank is screened for potential therapeutic use cases where these polymers could be used to develop novel formulations of hydrophobic drugs. The most promising models are provided as a software tool for other researchers to test the applicability of these delivery systems for potential new drug candidates.
Chemistry
What problem does this paper attempt to address?
This paper addresses the prediction of loading capacity (LC) and loading efficiency (LE) in drug delivery systems (DDS), specifically the polymeric micelles formed by amphiphilic ABA-triblock copolymers. Many drugs are poorly water-soluble, which poses a major challenge for the pharmaceutical industry. Nanotechnology has offered a range of possible solutions in recent years, but the driving forces inside these DDS remain insufficiently understood, so formulation development still depends largely on time-consuming and resource-intensive experimental screening. The authors therefore aim to streamline development and reduce experimental cost and time by adding complementary computational methods.

Specifically, the paper collects and extends existing experimental data sets and uses machine learning to build models that predict LC and LE for different polymer and drug combinations. Such models can help researchers judge whether a new drug candidate is suitable for delivery with these polymeric micelles. The authors also virtually screen known compounds and provide the results as a publicly available list of potential use cases: poorly soluble drugs that may be suited to these delivery systems. To this end, they combined data from multiple previous publications with in-house laboratory data to create a formulation database containing 3,700 experimental data points. Using this database, they tested different molecular properties and fingerprints as formulation-specific mixture descriptors and built a variety of classification and regression models. Finally, they provide a software tool that lets other researchers test the suitability of these delivery systems for potential new drug candidates.
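The two target quantities and the reported metric can be made concrete with a small sketch. It assumes the mass-based definitions commonly used for micellar formulations, LE = solubilized drug / added drug and LC = solubilized drug / (solubilized drug + polymer); the text above does not state these formulas, so treat them as an assumption. The class name `Formulation`, the function names, and the 0.5 threshold are hypothetical and chosen only for illustration; they are not taken from the POxload tool.

```python
from dataclasses import dataclass


@dataclass
class Formulation:
    """One formulation experiment (hypothetical record layout, concentrations in g/L)."""
    drug_added: float        # drug concentration fed into the formulation
    drug_solubilized: float  # drug concentration measured in the micellar solution
    polymer: float           # polymer concentration


def loading_efficiency(f: Formulation) -> float:
    # Assumed definition: fraction of the added drug that ends up solubilized.
    return f.drug_solubilized / f.drug_added


def loading_capacity(f: Formulation) -> float:
    # Assumed definition: drug mass fraction of the loaded micelles.
    return f.drug_solubilized / (f.drug_solubilized + f.polymer)


def is_high(value: float, threshold: float = 0.5) -> bool:
    # Turns a continuous LE or LC value into a binary class label.
    # The 0.5 default is an arbitrary illustration, not a threshold from the paper.
    return value >= threshold


def balanced_accuracy(y_true: list[bool], y_pred: list[bool]) -> float:
    # Mean of sensitivity and specificity for a binary classifier.
    # Expects both classes to be present in y_true.
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    tn = sum((not t) and (not p) for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Made-up example values, not data from the paper.
example = Formulation(drug_added=10.0, drug_solubilized=8.0, polymer=10.0)
le = loading_efficiency(example)
lc = loading_capacity(example)
print(f"LE = {le:.2f}, LC = {lc:.2f}, high LE: {is_high(le)}, high LC: {is_high(lc)}")
```

Balanced accuracy averages the per-class recall, which makes it a reasonable headline metric when the high-loading and low-loading classes are unevenly represented, as they are likely to be once a threshold is applied to experimental LE or LC values.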