Abstract: Drug discovery aims to keep supplying new medicines that cure or palliate the many ailments, including some currently untreatable diseases, that still afflict humanity. The ADME/Tox (absorption, distribution, metabolism, excretion/toxicity) properties of candidate drug molecules are key factors that determine the safety, uptake, elimination, metabolic behavior and effectiveness of a drug, and hence the success of drug research and development. ADME/Tox prediction drastically reduces the fraction of pharmaceutics-related failures in the early stages of drug development. Driven by the expectation of accelerated timelines, reduced costs and the potential to reveal hidden insights from vast datasets, artificial intelligence techniques such as Graphormer are showing increasing promise for building custom models for molecular modeling tasks. However, Graphormer and other transformer-based models do not take into account molecular fingerprints or the physicochemical properties that have proved effective in traditional computational drug research. Here, we propose an enhanced model based on Graphormer that uses a tree model to integrate such known information, achieving better predictive performance and interpretability. The model achieves new state-of-the-art results on ADME/Tox property prediction benchmarks, surpassing several competitive models. Experimental results show an average SMAPE (Symmetric Mean Absolute Percentage Error) of 18.9 and a PCC (Pearson Correlation Coefficient) of 0.86 on ADME/Tox prediction test sets. These findings highlight the efficacy of our approach and its potential to enhance drug discovery processes. By combining the strengths of Graphormer with additional molecular descriptors, our model offers improved predictive capabilities and thus contributes to the advancement of ADME/Tox prediction in drug development.
Integrating these information sources also improves interpretability, helping researchers understand the factors that drive the predictions. Overall, our work demonstrates the potential of the enhanced model to expedite drug discovery, reduce costs, and raise the success rate of pharmaceutical development.
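
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how SMAPE and PCC can be computed. The abstract does not state which SMAPE variant is used, so the sketch assumes the common form with the mean of the absolute true and predicted values in the denominator, expressed in percent. The function names `smape` and `pearson` and the zero-denominator convention are illustrative choices, not taken from the paper.

```python
import math


def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, in percent (range 0 to 200).

    Assumed variant: mean over samples of |t - p| / ((|t| + |p|) / 2).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = (abs(t) + abs(p)) / 2.0
        # Assumed convention: a pair where both values are zero adds no error.
        if denom != 0.0:
            total += abs(t - p) / denom
    return 100.0 * total / len(y_true)


def pearson(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0.0 or var_p == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_t * var_p)
```

Under this definition a lower SMAPE indicates smaller relative error, while a PCC closer to 1 indicates stronger linear agreement between predicted and measured values.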

Chemformer: a pre-trained transformer for computational chemistry

Do Chemformers Dream of Organic Matter? Evaluating a Transformer Model for Multistep Retrosynthesis

3D-Transformer: Molecular Representation with Transformer in 3D Space

One Transformer Can Understand Both 2D & 3D Molecular Data

Chemical transformer compression for accelerating both training and inference of molecular modeling

Application of Transformers in Cheminformatics

Harnessing Data Augmentation and Normalization Preprocessing to Improve the Performance of Chemical Reaction Predictions of Data-Driven Model

Molecular Transformer - A Model for Uncertainty-Calibrated Chemical Reaction Prediction

SMILES Transformer: Pre-trained Molecular Fingerprint for Low Data Drug Discovery

Transformer Performance for Chemical Reactions: Analysis of Different Predictive and Evaluation Scenarios

Molecular Geometry-aware Transformer for accurate 3D Atomic System modeling

GraphXForm: Graph transformer for computer-aided molecular design with application to extraction

RetroPrime: A Chemistry-Inspired and Transformer-based Method for Retrosynthesis Predictions

A multi-modal pre-training transformer for universal transfer learning in metal–organic frameworks

Pretraining Graph Transformer for Molecular Representation with Fusion of Multimodal Information

Exhaustive local chemical space exploration using a transformer model

Molformer: Motif-based Transformer on 3D Heterogeneous Molecular Graphs

Reagent prediction with a molecular transformer improves reaction data quality

Dual-view Molecular Pre-training

An Evolved Transformer Model for ADME/Tox Prediction

C5T5: Controllable Generation of Organic Molecules with Transformers