Abstract: We present a novel computational approach for predicting human pharmacokinetics (PK) that addresses the challenges of early-stage drug design. Our study introduces and describes a large-scale dataset of 11 clinical PK endpoints, encompassing over 2700 unique chemical structures to train machine learning models. To that end, multiple advanced training strategies are compared, including the integration of in vitro data and a novel self-supervised pre-training task. In addition to the predictions, our final model provides meaningful epistemic uncertainties for every data point. This allows us to successfully identify regions of exceptional predictive performance, with an Absolute Average Fold Error (AAFE/GMFE) of less than 2.5 across multiple endpoints. These advancements represent a significant leap towards actionable PK predictions, which can be utilized early on in the drug design process to expedite development and reduce reliance on nonclinical studies.
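The AAFE/GMFE metric quoted in the abstract is commonly defined as 10 raised to the mean absolute log10 ratio of predicted to observed values, so a score of 2 means predictions are off by a factor of two on average. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `aafe` and the sample values are hypothetical and chosen only for illustration.

```python
import math


def aafe(predicted, observed):
    """Absolute average fold error: 10 ** mean(|log10(pred / obs)|).

    Assumes all values are strictly positive, as PK parameters are.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    log_errors = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(log_errors) / len(log_errors))


# Hypothetical values (arbitrary units), for illustration only.
predicted = [2.0, 5.0, 40.0]
observed = [1.0, 10.0, 40.0]
score = aafe(predicted, observed)
print(f"AAFE = {score:.3f}")
```

Because the error is averaged in log space, a twofold over-prediction and a twofold under-prediction are penalized equally, and a perfect prediction contributes a fold error of 1.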

What problem does this paper attempt to address?

This paper addresses the problem of predicting human pharmacokinetics (PK) effectively in the early stage of drug design. Specifically, the authors propose a new computational method that aims to overcome the limitations of existing approaches in terms of data scale, single-attribute prediction, and model performance evaluation. By constructing a large-scale dataset containing more than 2,700 unique chemical structures and training machine learning models to predict 11 clinical PK endpoints, the authors not only improve prediction accuracy but also provide meaningful uncertainty estimates for each data point. These improvements make the predictions more reliable and able to guide early-stage drug design, which in turn accelerates drug development and reduces the dependence on nonclinical studies.

### Main Objectives:

1. **Improve Prediction Accuracy**: Improve the accuracy of human PK prediction by introducing a large-scale dataset and advanced training strategies.
2. **Provide Uncertainty Estimates**: Provide a meaningful uncertainty estimate for each prediction to help identify regions with particularly good predictive performance.
3. **Multi-task Prediction**: Predict multiple PK attributes simultaneously to make full use of the information shared between related attributes.
4. **Integrate Multiple Data Sources**: Combine in vitro data and preclinical in vivo data to enhance the generalization ability of the model.
5. **Self-supervised Pre-training**: Pre-train the model with a self-supervised task to improve its performance on downstream tasks.

### Problems Solved:

- **Small Dataset Scale**: Existing human PK datasets are small and usually focus on a single attribute.
- **Ignoring Related Information**: Single-attribute prediction ignores related attributes, PK results from other species, and the corresponding in vitro data.
- **Incomplete Model Performance Evaluation**: Existing evaluations focus mainly on average prediction quality, which may not represent the predictive performance for specific compounds.

Through these methods, the paper aims to provide a practical, actionable PK prediction tool that offers reliable guidance in the early stage of drug design and thereby accelerates drug development.
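The idea of using per-compound uncertainty to find regions of good predictive performance can be sketched as a simple filter: keep only predictions whose uncertainty falls below a cutoff, then compare the fold error of that subset with the full set. This is an illustrative sketch under stated assumptions, not the paper's procedure; the record layout, the `select_confident` helper, the cutoff of 0.3, and all numbers are hypothetical.

```python
import math


def aafe(predicted, observed):
    """Absolute average fold error: 10 ** mean(|log10(pred / obs)|)."""
    log_errors = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(log_errors) / len(log_errors))


def select_confident(records, max_uncertainty):
    """Keep records whose uncertainty is at or below the cutoff (hypothetical helper)."""
    return [r for r in records if r["uncertainty"] <= max_uncertainty]


# Hypothetical predictions with made-up uncertainty scores.
records = [
    {"pred": 2.0, "obs": 1.0, "uncertainty": 0.6},
    {"pred": 10.0, "obs": 10.0, "uncertainty": 0.1},
    {"pred": 3.0, "obs": 30.0, "uncertainty": 0.9},
    {"pred": 8.0, "obs": 4.0, "uncertainty": 0.2},
]

confident = select_confident(records, max_uncertainty=0.3)  # assumed cutoff

aafe_all = aafe([r["pred"] for r in records], [r["obs"] for r in records])
aafe_confident = aafe([r["pred"] for r in confident], [r["obs"] for r in confident])
print(f"all compounds: {aafe_all:.2f}, low-uncertainty subset: {aafe_confident:.2f}")
```

If the uncertainties are informative, the low-uncertainty subset should show a lower fold error than the full set, which is what makes the predictions in that subset actionable for individual compounds rather than only on average.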

Actionable predictions of human pharmacokinetics at the drug design stage
