Abstract: Among virtual screening methods that have been developed to facilitate the drug discovery process, chemogenomics is distinctive in that it tackles the prediction of ligands for proteins at scale in both the protein and chemical spaces. Therefore, in addition to predicting drug candidates for a given therapeutic protein target, as more classical ligand-based or receptor-based methods do, chemogenomics can also predict off-targets at the proteome level and thereby identify potential side effects or drug repositioning opportunities. In this study, we compare machine-learning and deep learning approaches for chemogenomics that are applicable to screening large sets of compounds against large sets of druggable proteins. State-of-the-art chemogenomics methods rely on expert-based chemical and protein descriptors or similarity measures. The recent development of deep learning approaches has enabled the design of algorithms that learn abstract numerical representations of molecular graphs and protein sequences in an end-to-end fashion, i.e., so that the learnt features optimise the objective function of the drug-target interaction prediction task. In this paper, we address drug-target interaction prediction at the druggable proteome level with what we define as the chemogenomic neural network. This network consists of a feed-forward neural network taking as input the combination of molecular and protein representations learnt by molecular graph and protein sequence encoders. We first propose a standard formulation of this chemogenomic neural network. Then, we compare the performance of the standard chemogenomic network to reference deep learning or shallow (machine-learning without deep learning) methods. In particular, we show that such a representation learning approach is competitive with state-of-the-art shallow chemogenomics methods, but not ultimately superior.
We evaluate the most promising neural network architectures and data augmentation techniques, such as multi-view and transfer learning, to improve the prediction performance of the chemogenomic network. Our results provide new insights into the design of chemogenomics approaches based on representation learning algorithms. Most importantly, we conclude from our observations that a promising research direction is to integrate heterogeneous sources of data, such as various bioactivity datasets or, independently, multiple molecule and protein attribute views, instead of focusing on sophisticated, yet intuitively relevant, encoder neural network architectures.
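
To make the architecture described in the abstract concrete, the following is a minimal, untrained sketch of a forward pass through such a network: a molecular graph encoder, a protein sequence encoder, and a feed-forward head applied to the combined pair representation. It is an illustration under stated assumptions, not the authors' implementation. The encoders are deliberately simplistic stand-ins (one message-passing round with sum pooling, and mean-pooled residue embeddings), the combination is assumed to be a concatenation, and all names, dimensions, and random weights (`DIM`, `encode_molecule`, `encode_protein`, `predict_interaction`) are hypothetical.

```python
import math
import random

random.seed(0)  # fixed seed so the untrained sketch is reproducible

DIM = 8        # hypothetical size of every learnt representation
ATOM_DIM = 3   # hypothetical atom features: one-hot over (C, N, O)
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def rand_matrix(rows, cols):
    """Small random weights standing in for trained parameters."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def relu(vec):
    return [max(0.0, x) for x in vec]


def encode_molecule(atom_features, bonds, w_self, w_neigh):
    """One message-passing round over the molecular graph, then sum pooling."""
    neighbours = {i: [] for i in range(len(atom_features))}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)
    hidden = []
    for i, feats in enumerate(atom_features):
        # Aggregate the features of bonded atoms (sum over neighbours).
        agg = [sum(atom_features[j][k] for j in neighbours[i])
               for k in range(len(feats))]
        own = matvec(w_self, feats)
        msg = matvec(w_neigh, agg)
        hidden.append(relu([o + m for o, m in zip(own, msg)]))
    # Sum pooling turns per-atom vectors into one molecule-level vector.
    return [sum(h[k] for h in hidden) for k in range(DIM)]


def encode_protein(sequence, embedding):
    """Mean of per-residue embeddings (a stand-in for a CNN or RNN encoder)."""
    vectors = [embedding[aa] for aa in sequence]
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(DIM)]


def predict_interaction(mol_vec, prot_vec, w_hidden, w_out):
    """Feed-forward head on the concatenated pair representation."""
    pair = mol_vec + prot_vec  # assumption: combination = concatenation
    hidden = relu(matvec(w_hidden, pair))
    logit = sum(w * h for w, h in zip(w_out, hidden))
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid -> score in (0, 1)


# Randomly initialised parameters; a real model would learn these end to end.
w_self = rand_matrix(DIM, ATOM_DIM)
w_neigh = rand_matrix(DIM, ATOM_DIM)
embedding = {aa: [random.uniform(-0.5, 0.5) for _ in range(DIM)]
             for aa in AMINO_ACIDS}
w_hidden = rand_matrix(DIM, 2 * DIM)
w_out = [random.uniform(-0.5, 0.5) for _ in range(DIM)]

# Toy inputs: heavy atoms of ethanol (C-C-O) and an arbitrary short peptide.
atoms = [[1, 0, 0], [1, 0, 0], [0, 0, 1]]
bonds = [(0, 1), (1, 2)]
mol_vec = encode_molecule(atoms, bonds, w_self, w_neigh)
prot_vec = encode_protein("MKTAYIAK", embedding)
score = predict_interaction(mol_vec, prot_vec, w_hidden, w_out)
print(f"untrained interaction score: {score:.3f}")
```

In an end-to-end setting, the encoder weights and the feed-forward weights would be trained jointly against the interaction labels, which is what distinguishes this design from shallow methods that rely on fixed expert-based descriptors.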

Multimodal Deep Neural Networks using Both Engineered and Learned Representations for Biodegradability Prediction

Deep Learning for Green Chemistry: An AI-Enabled Pathway for Biodegradability Prediction and Organic Material Discovery

Chemception: A Deep Neural Network with Minimal Chemistry Knowledge Matches the Performance of Expert-developed QSAR/QSPR Models

Bioplastic Design using Multitask Deep Neural Networks

A Framework for Accurate Prediction of Plastic-Degrading Enzymes using Convolutional Neural Networks

Deep-Feature-Based Approach to Marine Debris Classification

DeepBBBP: High accuracy Blood‐Brain‐Barrier Permeability Prediction with a Mixed Deep Learning Model

Integrating Chemical Language and Molecular Graph in Multimodal Fused Deep Learning for Drug Property Prediction

Multimodal fused deep learning for drug property prediction: Integrating chemical language and molecular graph

A deep learning architecture for metabolic pathway prediction

A Deep Neural Network -- Mechanistic Hybrid Model to Predict Pharmacokinetics in Rat

Evaluation of Activated Sludge Settling Characteristics from Microscopy Images with Deep Convolutional Neural Networks and Transfer Learning

Predictive Multitask Deep Neural Network Models for ADME-Tox Properties: Learning from Large Data Sets

Brain-inspired multimodal approach for effluent quality prediction using wastewater surface images and water quality data

Multimodal representations of biomedical knowledge from limited training whole slide images and reports using deep learning

Optimizing Degradable Plastic Density Prediction: A Coarse-to-Fine Deep Neural Network Approach

SB-Net: Synergizing CNN and LSTM networks for uncovering retrosynthetic pathways in organic synthesis

Step Change Improvement in ADMET Prediction with PotentialNet Deep Featurization

Application of deep learning for predicting the treatment performance of real municipal wastewater based on one-year operation of two anaerobic membrane bioreactors

Evaluation of network architecture and data augmentation methods for deep learning in chemogenomics

Deep Multimodal Network for Multi-Label Classification.