Escaping the drug-bias trap: using debiasing design to improve interpretability and generalization of drug-target interaction prediction

Pei-Dong Zhang, Jianzhu Ma, Ting Chen
DOI: https://doi.org/10.1101/2024.09.12.612771
2024-09-15
Abstract: Considering the high cost associated with determining reaction affinities through in-vitro experiments, virtual screening of potential drugs bound with specific protein pockets from vast compounds is critical in AI-assisted drug discovery. Deep-learning approaches have been proposed for Drug-Target Interaction (DTI) prediction. However, they have shown overestimated accuracy because of the drug-bias trap, a challenge that results from excessive reliance on the drug branch in the traditional drug-protein dual-branch network approach. This casts doubt on the interpretability and generalizability of existing DTI models. Therefore, we introduce UdanDTI, an innovative deep-learning architecture designed specifically for predicting drug-protein interactions. UdanDTI applies an unbalanced dual-branch system and an attentive aggregation module to enhance interpretability from a biological perspective. Across various public datasets, UdanDTI demonstrates outstanding performance, outperforming state-of-the-art models under in-domain, cross-domain, and structural interpretability settings. Notably, it demonstrates exceptional accuracy in predicting drug responses of two crucial subgroups of Epidermal Growth Factor Receptor (EGFR) mutations associated with non-small cell lung cancer, consistent with experimental results. Meanwhile, UdanDTI could complement the advanced molecular docking software DiffDock. The codes and datasets of UdanDTI are available at .
Bioinformatics
What problem does this paper attempt to address?
This paper addresses the "drug-bias trap" in drug-target interaction (DTI) prediction: existing deep-learning models report overestimated accuracy and offer limited interpretability and generalization because they rely too heavily on the drug branch of the conventional drug-protein dual-branch network. In doing so, they neglect the complex spatial characteristics of the protein pocket, which leads to poor performance in practical applications. To solve this problem, the authors propose UdanDTI, a deep-learning architecture that improves interpretability and generalization through an unbalanced dual-branch system and an attentive aggregation module. According to the authors, UdanDTI outperforms state-of-the-art models on multiple public datasets in in-domain and cross-domain settings, and it accurately predicts drug responses for two important subgroups of epidermal growth factor receptor (EGFR) mutations associated with non-small cell lung cancer. UdanDTI can also complement the molecular docking software DiffDock, which adds to its value in virtual screening.
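One way to see why reliance on the drug branch can inflate accuracy is to score a baseline that never looks at the protein. The sketch below is illustrative only and is not the evaluation procedure from the paper: the function name `drug_only_baseline_accuracy` and the toy (drug, protein, label) triples are hypothetical. It predicts each test pair from the majority training label of its drug alone; a high score from such a protein-blind baseline suggests a benchmark can be solved largely by memorizing drugs.

```python
from collections import defaultdict

def drug_only_baseline_accuracy(train, test):
    """Accuracy of a protein-blind baseline (hypothetical diagnostic).

    Each pair is (drug_id, protein_id, label) with label in {0, 1}.
    The protein is ignored on purpose: a test pair is predicted from the
    majority training label of its drug, with ties resolved to 1.
    """
    per_drug = defaultdict(list)
    for drug, _protein, label in train:
        per_drug[drug].append(label)

    # Fallback for drugs that never appear in training.
    all_labels = [label for _, _, label in train]
    global_guess = int(sum(all_labels) / len(all_labels) >= 0.5)

    correct = 0
    for drug, _protein, label in test:
        seen = per_drug.get(drug)
        guess = int(sum(seen) / len(seen) >= 0.5) if seen else global_guess
        correct += int(guess == label)
    return correct / len(test)

# Toy data (made up for illustration): d1 mostly binds, d2 never binds.
train = [
    ("d1", "p1", 1), ("d1", "p2", 1), ("d1", "p3", 1),
    ("d2", "p1", 0), ("d2", "p2", 0), ("d2", "p3", 0),
    ("d3", "p1", 1), ("d3", "p2", 1), ("d3", "p3", 0),
]
# All test pairs involve an unseen protein p4; d4 is an unseen drug.
test = [("d1", "p4", 1), ("d2", "p4", 0), ("d3", "p4", 1), ("d4", "p4", 0)]

accuracy = drug_only_baseline_accuracy(train, test)
print(accuracy)
```

On this toy split the baseline gets the three seen drugs right and misses only the unseen drug, even though it has no information about the protein. Comparing such a baseline across in-domain and cross-domain splits is one plausible way to gauge how much a reported score depends on drug identity.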