Machine learning driven bioequivalence risk assessment at an early stage of generic drug development

Dejan Krajcar, Dejan Velušček, Iztok Grabnar
DOI: https://doi.org/10.1016/j.ejpb.2024.114553
IF: 5.589
2024-10-29
European Journal of Pharmaceutics and Biopharmaceutics
Abstract:
Background: Bioequivalence risk assessment as an extension of quality risk management lacks examples of quantitative approaches to risk assessment at an early stage of generic drug development. The aim of our study was to develop a model-based approach for bioequivalence risk assessment that uses pharmacokinetic and physicochemical characteristics of drugs as predictors and would standardize the first step of risk assessment.
Methods: The Sandoz in-house bioequivalence database of 128 bioequivalence studies with poorly soluble drugs (23.5% non-bioequivalent) was used to train and validate the model. Four modeling approaches were compared: random forest, XGBoost, logistic regression and naïve Bayes.
Results: Among the best performing machine learning models, random forest was selected and optimized for the number of features, resulting in an accuracy of 84% on the test data set. The most important features for prediction were those related to solubility (dose number, acid dissociation constant), absorption and elimination rate, effective permeability, variability of pharmacokinetic endpoints, and absolute bioavailability. All features had a conceivable influence on the model predictions.
Conclusion: The model was used to develop a bioequivalence risk assessment approach to categorize drugs in early development into high, medium or low risk classes.
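As an illustration of two ideas the abstract mentions, the sketch below computes a dose number and maps a predicted probability of non-bioequivalence to a risk class. The dose number follows the standard definition (dose divided by a 250 mL reference volume, divided by solubility). Everything else is an assumption made for illustration: the function names, the idea that the classifier returns a probability, and the cut-off values 0.2 and 0.5 are hypothetical and are not taken from the paper, which does not state its thresholds in the abstract.

```python
def dose_number(dose_mg: float, solubility_mg_per_ml: float,
                volume_ml: float = 250.0) -> float:
    """Dose number Do = (dose / volume) / solubility.

    Do > 1 indicates that the dose would not fully dissolve in the
    reference volume (250 mL by convention).
    """
    if solubility_mg_per_ml <= 0 or volume_ml <= 0:
        raise ValueError("solubility and volume must be positive")
    return (dose_mg / volume_ml) / solubility_mg_per_ml


def risk_class(p_non_be: float, low_cut: float = 0.2,
               high_cut: float = 0.5) -> str:
    """Map a predicted probability of non-bioequivalence to a risk class.

    The cut-offs are hypothetical placeholders, not values from the paper.
    """
    if not 0.0 <= p_non_be <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if p_non_be >= high_cut:
        return "high"
    if p_non_be >= low_cut:
        return "medium"
    return "low"


if __name__ == "__main__":
    # Hypothetical drug: 100 mg dose, solubility 0.1 mg/mL.
    print(dose_number(100.0, 0.1))
    # Hypothetical model output of 0.35 for the same drug.
    print(risk_class(0.35))
```

In practice the cut-offs would be chosen from the validated model's performance on held-out studies rather than fixed in advance.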
pharmacology & pharmacy