Abstract: Deep learning-based drug response prediction (DRP) methods can accelerate the drug discovery process and reduce R&D costs. Although mainstream methods achieve high accuracy in predicting response regression values, their regression-aware representations are fragmented and fail to capture the continuity of the sample order. As a result, models are optimized toward sub-optimal solution spaces, which reduces generalization ability and may lead to significant wasted costs in the drug discovery phase. In this paper, we propose CLDR, a contrastive learning framework with natural language supervision for DRP. CLDR converts regression labels into text, which is merged with the caption text of the drug response to form a second modality of each sample alongside the traditional modalities (graph, sequence). In each batch, the two modalities of the same sample are treated as a positive pair, and all other pairs are treated as negative pairs. In addition, to enhance the continuous representation capability of the numerical text, a common-sense numerical knowledge graph is introduced. We validate the framework on several hundred thousand samples from the Genomics of Drug Sensitivity in Cancer dataset and observe that the average improvement of DRP methods ranges from 7.8% to 31.4% when our framework is applied. The experiments show that CLDR effectively constrains the samples to a continuous distribution in the representation space and achieves impressive prediction performance with only a few epochs of fine-tuning after pre-training. The code is available at: https://gitee.com/xiaoyibang/clipdrug.git

What problem does this paper attempt to address?

The main problem this paper addresses is the weakness of representation learning in existing drug response prediction (DRP) methods on regression tasks, in particular their poor generalization under zero-shot conditions. Although traditional DRP methods predict the response of cell lines to drugs well, their performance drops significantly on unseen compounds. The cause is that these methods do not capture the inherent order of continuous numerical values, so the model is optimized toward a sub-optimal solution space, which hurts both generalization and cost-effectiveness in practical applications.

To address this, the authors propose CLDR (Contrastive Learning Drug Response Models from Natural Language Supervision), a contrastive learning framework combined with natural language supervision. The main contributions of CLDR are:

1. **Connecting drug response data with annotated text**: Continuous numerical labels are converted into natural language text and used, together with the description text of the drug response process, as the second modality of each sample. This strengthens the model's understanding and representation of continuous numerical values.
2. **Introducing a common-sense numerical knowledge graph**: Based on the definition of ordinal numbers, a common-sense numerical knowledge graph (CN-KG) is constructed to strengthen the model's perception of numerical continuity.
3. **Improving zero-shot generalization**: Through contrastive learning in the pre-training and fine-tuning stages, CLDR maps samples into a continuously distributed representation space, improving prediction performance under zero-shot conditions.
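The first two contributions can be illustrated with a small sketch. The summary does not give the actual caption template or the construction of the CN-KG, so everything here is an assumption: the function names `label_to_text` and `ordinal_triples`, the sentence template, the `"precedes"` relation, and the value grid are all hypothetical and only show the general idea of turning a numeric label into text and linking neighbouring numbers by their order.

```python
from decimal import Decimal
from typing import List, Tuple


def label_to_text(value: float, drug: str, cell_line: str, ndigits: int = 2) -> str:
    """Render a regression label as a short caption.

    The sentence template is hypothetical; the paper's real captions may differ.
    """
    return f"The response of cell line {cell_line} to drug {drug} is {value:.{ndigits}f}."


def ordinal_triples(start: str, stop: str, step: str) -> List[Tuple[str, str, str]]:
    """Link each value on a fixed grid to its successor.

    This is one guess at what an ordinal-based numeric graph could look like:
    a chain of (smaller, "precedes", larger) triples. Decimal is used so the
    grid values print exactly (no 0.30000000000000004).
    """
    current, last, delta = Decimal(start), Decimal(stop), Decimal(step)
    values = []
    while current <= last:
        values.append(str(current))
        current += delta
    # Pair every grid value with the next one on the grid.
    return [(a, "precedes", b) for a, b in zip(values, values[1:])]


if __name__ == "__main__":
    print(label_to_text(0.4567, "Erlotinib", "A549"))
    print(ordinal_triples("0.0", "0.3", "0.1"))
```

A chain of successor edges is the simplest structure that encodes order; a fuller graph could add other relations, but that would go beyond what the summary states.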
The experimental results show that CLDR significantly improves prediction performance across multiple DRP methods. Under zero-shot conditions in particular, the average improvement ranges from 7.8% to 31.4%. These results support the effectiveness of CLDR and suggest that it could improve the efficiency and success rate of pre-clinical screening in drug discovery.
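The in-batch pairing described in the abstract (the two modalities of one sample form a positive pair, all other pairings are negatives) is commonly implemented as a symmetric cross-entropy over a similarity matrix. The sketch below shows that generic formulation in plain Python. It is not confirmed to be the exact loss used by CLDR; the function name `symmetric_contrastive_loss` and the temperature value of 0.07 are assumptions for illustration.

```python
import math
from typing import List, Sequence


def _normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def _diagonal_cross_entropy(logits: List[List[float]]) -> float:
    """Mean cross-entropy where the correct class of row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(
    sample_emb: Sequence[Sequence[float]],
    text_emb: Sequence[Sequence[float]],
    temperature: float = 0.07,  # assumed value, not taken from the paper
) -> float:
    """Generic in-batch contrastive loss between two modalities.

    Row i of sample_emb and row i of text_emb belong to the same sample
    (positive pair); every other combination in the batch is a negative.
    """
    a = [_normalize(v) for v in sample_emb]
    b = [_normalize(v) for v in text_emb]
    # Cosine similarity of every sample embedding with every text embedding.
    logits = [
        [sum(x * y for x, y in zip(u, w)) / temperature for w in b] for u in a
    ]
    logits_t = [list(col) for col in zip(*logits)]
    # Average the sample-to-text and text-to-sample directions.
    return 0.5 * (_diagonal_cross_entropy(logits) + _diagonal_cross_entropy(logits_t))


if __name__ == "__main__":
    matched = symmetric_contrastive_loss([[1, 0], [0, 1]], [[1, 0], [0, 1]])
    swapped = symmetric_contrastive_loss([[1, 0], [0, 1]], [[0, 1], [1, 0]])
    print(matched < swapped)
```

When the paired embeddings line up, the loss is close to zero; when the pairing is swapped, it is large, which is the signal that pulls the two modalities of each sample together during pre-training.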

CLDR: Contrastive Learning Drug Response Models from Natural Language Supervision

MMDRP: drug response prediction and biomarker discovery using multi-modal deep learning

GPDRP: a multimodal framework for drug response prediction with graph transformer

Zero-shot Learning of Drug Response Prediction for Preclinical Drug Screening

Understanding the Sources of Performance in Deep Drug Response Models Reveals Insights and Improvements

Efficient modelling of channel maps with correlated shadow fading in mobile radio systems

LABORATORY.

Regressor-free Molecule Generation to Support Drug Response Prediction

Comparison between laparoscopic sterilization and tubal ligation.

Data Imbalance in Drug Response Prediction – Multi-Objective Optimization Approach in Deep Learning Setting

Dynamic Interaction Learning and Multimodal Representation for Drug Response Prediction

TransCDR: a deep learning model for enhancing the generalizability of cancer drug response prediction through transfer learning and multimodal data fusion for drug representation

Encephalographic Cortical Atrophy

DrugCLIP: Contrastive Drug-Disease Interaction For Drug Repurposing

GCFMCL: predicting miRNA-drug sensitivity using graph collaborative filtering and multi-view contrastive learning

A Novel Descriptor and Molecular Graph-Based Bimodal Contrastive Learning Framework for Drug Molecular Property Prediction.

GraphCDR: a graph neural network method with contrastive learning for cancer drug response prediction

MLRDA: A Multi-Task Semi-Supervised Learning Framework for Drug-Drug Interaction Prediction

A novel heterogeneous network-based method for drug response prediction in cancer cell lines

Multimodal contrastive representation learning for drug-target binding affinity prediction

SADR: Self-supervised Graph Learning with Adaptive Denoising for Drug Repositioning