SSF-DDI: a deep learning method utilizing drug sequence and substructure features for drug–drug interaction prediction

Jing Zhu, Chao Che, Hao Jiang, Jian Xu, Jiajun Yin, Zhaoqian Zhong
DOI: https://doi.org/10.1186/s12859-024-05654-4
IF: 3.307
2024-01-25
BMC Bioinformatics
Abstract: Drug–drug interactions (DDI) are prevalent in combination therapy, necessitating the importance of identifying and predicting potential DDI. While various artificial intelligence methods can predict and identify potential DDI, they often overlook the sequence information of drug molecules and fail to comprehensively consider the contribution of molecular substructures to DDI.
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology
What problem does this paper attempt to address?
This paper addresses the prediction of drug–drug interactions (DDI) in combination therapy. Existing methods often ignore the sequence information of drug molecules or fail to comprehensively consider the influence of molecular substructures on DDI, which limits their prediction accuracy, especially for unknown drugs. To address these limitations, the authors propose SSF-DDI (Sequence and Substructure Features for Drug–Drug Interaction prediction), a deep-learning-based model that combines the sequence features and substructure features of drug molecules to improve the accuracy and comprehensiveness of DDI prediction.

### Main contributions

1. **Proposed a new DDI prediction model, SSF-DDI**: The model integrates the sequence features and structural features of drug molecules. By incorporating both the topological properties and the sequence features of drug molecules, it captures a wider range of feature information and so represents drug molecules more accurately and comprehensively.
2. **Introduced a new drug substructure graph feature encoder (SGFE)**: The encoder extracts drug atom features and molecular structure features.
3. **Conducted experiments in the transductive setting and the inductive setting**: The experimental results show that SSF-DDI significantly outperforms other methods on multiple real-world datasets. In the inductive setting in particular, SSF-DDI performs well in predicting the DDI of new drugs, with an accuracy improvement of 5.67%.

### Method overview

- **Drug molecule sequence feature extraction module**: Use a convolutional neural network (CNN) to extract the sequence features of drug molecules, and determine the interaction score between the two drug sequence features through a mixed attention mechanism (MixAttention).
- **Substructure feature extraction module**: Construct a new drug substructure graph feature encoder (SGFE), use a directed message-passing neural network (D-MPNN) to extract substructure features, and then generate a feature vector containing substructure and topological information through a multi-layer graph attention network (GAT) and self-attention graph pooling (SAGPooling).
- **Prediction module**: Fuse the extracted sequence features and substructure features and feed them into a fully connected layer for the final drug relationship prediction.

### Experimental results

- **Experiments were conducted on the DrugBank and Twosides datasets**: In the transductive setting, the accuracy of SSF-DDI reaches 96.45%, a relative accuracy improvement of 0.36%. In the inductive setting, the accuracy of SSF-DDI in predicting the DDI of new drugs reaches 87.3%, an accuracy improvement of 5.67%.
- **Evaluation metrics**: Six evaluation metrics were used: accuracy (ACC), area under the ROC curve (AUC), F1 score (F1), precision (Precision), recall (Recall), and average precision (AP).

### Conclusion

By combining the sequence features and substructure features of drug molecules, SSF-DDI significantly improves the accuracy and robustness of DDI prediction, especially for unknown drugs. The method is relevant in practice and can help guide drug development and the safe clinical use of drugs.
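
The six evaluation metrics named under Experimental results (ACC, AUC, F1, Precision, Recall, AP) can be illustrated with a small, dependency-free Python sketch. This is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration only. The sketch treats DDI prediction as binary classification with one score per drug pair.

```python
from typing import Dict, Sequence


def threshold_metrics(y_true: Sequence[int], y_score: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """ACC, Precision, Recall and F1 after thresholding the scores.

    The 0.5 threshold is an assumption, not a value taken from the paper.
    """
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    acc = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"acc": acc, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive pair is scored above
    a random negative pair (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AP as the mean of precision@k over the ranks k of the positives.

    Assumes the scores contain no ties; tied scores would need grouping.
    """
    ranked = sorted(zip(y_score, y_true), key=lambda t: -t[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("AP needs at least one positive")
    return total / hits


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = interacting pair, 0 = non-interacting pair.
    labels = [1, 0, 1, 1, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
    print(threshold_metrics(labels, scores))
    print("auc:", roc_auc(labels, scores))
    print("ap:", average_precision(labels, scores))
```

ACC, Precision, Recall, and F1 depend on the chosen threshold, while AUC and AP are computed from the ranking of scores alone. That difference is one reason a benchmark may report both kinds of metric.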