DRML-Ensemble: drug repurposing method based on feature construction of multi-layer ensemble

Mengfei Zhang, Hongjian He, Jiang Xie, Qing Nie
DOI: https://doi.org/10.1007/s00894-024-06087-9
2024-07-31
Abstract: Context: Computational drug repurposing methods have been continuously developed in recent years to alleviate the high costs associated with drug development. As drug targets or the products of disease-related genes, proteins play an important role in drug repurposing. Although their potential has been demonstrated, heterogeneous graphs with proteins as independent nodes have yet to be studied, and extracting high-quality protein features from such graphs poses a significant challenge. This study proposes a novel drug repurposing model based on the feature construction of multi-layer ensemble (DRML-Ensemble). Evaluated on publicly available datasets, DRML-Ensemble achieves an AUPR of 0.93 and an AUROC of 0.92, surpassing existing state-of-the-art methods. Additionally, DRML-Ensemble demonstrates notable ability for drug repurposing in Alzheimer's disease. Methods: DRML-Ensemble is primarily composed of multiple layers of heterogeneous graph feature construction (HGFC). Each HGFC layer extracts protein features by leveraging the relationships between drugs, diseases, and proteins. These protein features are then used in subsequent layers to build drug and disease features, facilitating drug repurposing. By stacking multiple layers, optimal protein features can be obtained from the heterogeneous graph, consequently improving the accuracy of drug repurposing. However, excessive stacking of layers usually harms the model's training process, for example by causing overfitting, so a multi-layer ensemble prediction module is designed to further improve the model's performance.
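The layered idea in the Methods paragraph can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the aggregation rule, the scoring function, or how the ensemble combines layers, so the mean aggregation, the sigmoid dot-product score, and the simple averaging below are all assumptions, and every name (`hgfc_layer`, `score`, `ensemble_predict`) is hypothetical.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def row_normalize(m):
    """Scale each row to sum to 1 (rows with no links stay all-zero)."""
    out = []
    for row in m:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def hgfc_layer(drug_protein, disease_protein, protein_feats):
    """One hypothetical feature-construction layer (assumed mean aggregation).

    Drug and disease features are built from the proteins they link to,
    then protein features are refreshed from their linked drugs and diseases.
    """
    drug_feats = matmul(row_normalize(drug_protein), protein_feats)
    disease_feats = matmul(row_normalize(disease_protein), protein_feats)
    from_drugs = matmul(row_normalize(transpose(drug_protein)), drug_feats)
    from_diseases = matmul(row_normalize(transpose(disease_protein)), disease_feats)
    new_protein_feats = [
        [(a + b) / 2 for a, b in zip(r1, r2)]
        for r1, r2 in zip(from_drugs, from_diseases)
    ]
    return drug_feats, disease_feats, new_protein_feats


def score(drug_feats, disease_feats):
    """Assumed scoring rule: sigmoid of the drug-disease feature dot product."""
    logits = matmul(drug_feats, transpose(disease_feats))
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in logits]


def ensemble_predict(drug_protein, disease_protein, protein_feats, num_layers=3):
    """Average the per-layer scores (assumed ensemble rule) over stacked layers."""
    per_layer = []
    for _ in range(num_layers):
        drug_feats, disease_feats, protein_feats = hgfc_layer(
            drug_protein, disease_protein, protein_feats
        )
        per_layer.append(score(drug_feats, disease_feats))
    n_drugs, n_diseases = len(per_layer[0]), len(per_layer[0][0])
    return [
        [sum(layer[i][j] for layer in per_layer) / num_layers for j in range(n_diseases)]
        for i in range(n_drugs)
    ]
```

Averaging predictions from every depth, rather than trusting only the deepest layer, is one plausible way to limit the damage when deeper layers overfit; the paper's actual ensemble module may differ.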