Accurate prediction of molecular properties and drug targets using a self-supervised image representation learning framework

Xiangxiang Zeng, Hongxin Xiang, Linhui Yu, Jianmin Wang, Kenli Li, Ruth Nussinov, Feixiong Cheng
DOI: https://doi.org/10.1038/s42256-022-00557-6
IF: 23.8
2022-11-18
Nature Machine Intelligence
Abstract: The clinical efficacy and safety of a drug is determined by its molecular properties and targets in humans. However, proteome-wide evaluation of all compounds in humans, or even animal models, is challenging. In this study, we present an unsupervised pretraining deep learning framework, named ImageMol, pretrained on 10 million unlabelled drug-like, bioactive molecules, to predict molecular targets of candidate compounds. The ImageMol framework is designed to pretrain chemical representations from unlabelled molecular images on the basis of local and global structural characteristics of molecules from pixels. We demonstrate high performance of ImageMol in evaluation of molecular properties (that is, the drug's metabolism, brain penetration and toxicity) and molecular target profiles (that is, beta-secretase enzyme and kinases) across 51 benchmark datasets. ImageMol shows high accuracy in identifying anti-SARS-CoV-2 molecules across 13 high-throughput experimental datasets from the National Center for Advancing Translational Sciences. Via ImageMol, we identified candidate clinical 3C-like protease inhibitors for potential treatment of COVID-19.
computer science, artificial intelligence, interdisciplinary applications