Actionable predictions of human pharmacokinetics at the drug design stage

Leonid Komissarov, Nenad Manevski, Katrin Groebke Zbinden, Torsten Schindler, Marinka Zitnik, Lisa Sach-Peltason
DOI: https://doi.org/10.26434/chemrxiv-2024-vgbxq
2024-03-06
Abstract: We present a novel computational approach for predicting human pharmacokinetics (PK) that addresses the challenges of early-stage drug design. Our study introduces and describes a large-scale dataset of 11 clinical PK endpoints, encompassing over 2700 unique chemical structures to train machine learning models. To that end, multiple advanced training strategies are compared, including the integration of in vitro data and a novel self-supervised pre-training task. In addition to the predictions, our final model provides meaningful epistemic uncertainties for every data point. This allows us to successfully identify regions of exceptional predictive performance, with an Absolute Average Fold Error (AAFE/GMFE) of less than 2.5 across multiple endpoints. These advancements represent a significant leap towards actionable PK predictions, which can be utilized early on in the drug design process to expedite development and reduce reliance on nonclinical studies.
Chemistry
What problem does this paper attempt to address?
The paper addresses the problem of predicting human pharmacokinetics (PK) effectively at the early stage of drug design. Specifically, the authors propose a new computational method that aims to overcome the limitations of existing approaches in data scale, single-attribute prediction, and model performance evaluation. By constructing a large-scale dataset of more than 2,700 unique chemical structures and training machine-learning models to predict 11 clinical PK endpoints, the authors not only improve prediction accuracy but also provide a meaningful uncertainty estimate for each data point. These improvements make the predictions more reliable and more useful as guidance during early drug design, which can accelerate drug development and reduce the dependence on non-clinical studies.

### Main Objectives:
1. **Improve Prediction Accuracy**: Improve the accuracy of human PK prediction by introducing a large-scale dataset and advanced training strategies.
2. **Provide Uncertainty Estimates**: Provide a meaningful uncertainty estimate for each prediction to help identify regions where predictive performance is particularly good.
3. **Multi-task Prediction**: Predict multiple PK attributes simultaneously to make full use of the information shared between related attributes.
4. **Integrate Multiple Data Sources**: Combine in vitro data and pre-clinical in vivo data to enhance the generalization ability of the model.
5. **Self-supervised Pre-training**: Pre-train the model with a self-supervised task to improve its performance on downstream tasks.

### Problems Solved:
- **Small Dataset Scale**: Existing human PK datasets are small and usually focus on a single attribute.
- **Ignoring Related Information**: Single-attribute prediction ignores related attributes, PK results from other species, and the corresponding in vitro data.
- **Incomplete Model Performance Evaluation**: Existing evaluations focus mainly on average prediction quality, which may not reflect how well a model predicts a specific compound.

With these methods, the paper aims to provide a practical, actionable PK prediction tool that offers reliable guidance at the early stage of drug design and thereby accelerates drug development.
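The abstract reports accuracy as an Absolute Average Fold Error (AAFE/GMFE) below 2.5 in regions selected by the model's uncertainty estimates. As a rough illustration, the sketch below uses the commonly used definition of AAFE, 10 raised to the mean absolute log10 ratio of predicted to observed values, and then recomputes it on the subset of predictions whose uncertainty falls under a cutoff. The function names, the example values, and the threshold are hypothetical; the paper's actual selection procedure is not described here, so this should be read as one plausible way to perform such an analysis, not as the authors' implementation.

```python
import math


def aafe(predicted, observed):
    """Absolute average fold error: 10 ** mean(|log10(predicted / observed)|).

    Values must be positive. A result of 1.0 means perfect agreement,
    and 2.0 means predictions are off by a factor of two on average.
    """
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    log_errors = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(log_errors) / len(log_errors))


def aafe_below_uncertainty(predicted, observed, uncertainty, threshold):
    """AAFE restricted to predictions with uncertainty <= threshold.

    Returns (aafe_value, number_of_kept_points); aafe_value is None
    when no prediction passes the cutoff.
    """
    kept = [
        (p, o)
        for p, o, u in zip(predicted, observed, uncertainty)
        if u <= threshold
    ]
    if not kept:
        return None, 0
    kept_pred, kept_obs = zip(*kept)
    return aafe(kept_pred, kept_obs), len(kept)


# Hypothetical values for one PK endpoint (arbitrary units).
observed = [1.0, 10.0, 100.0, 5.0]
predicted = [2.0, 10.0, 50.0, 50.0]
uncertainty = [0.2, 0.1, 0.3, 0.9]  # assumed per-compound uncertainty scores

overall = aafe(predicted, observed)
confident, n_kept = aafe_below_uncertainty(
    predicted, observed, uncertainty, threshold=0.5
)
print(f"AAFE, all compounds: {overall:.2f}")
print(f"AAFE, {n_kept} low-uncertainty compounds: {confident:.2f}")
```

In this toy data the compound with the largest error also carries the highest uncertainty, so excluding it lowers the AAFE from about 2.51 to about 1.59. That is the behaviour one would hope for from a well-calibrated uncertainty estimate: the cutoff trades coverage (fewer compounds) for accuracy on the compounds that remain.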