A Study of Drug-Disease Associations Prediction on Diabetes Literature Knowledge Network

Yunxia Liu, Pin Liang, Jiazhen Linl, Xuan He
DOI: https://doi.org/10.1109/itnec60942.2024.10733133
2024-01-01
Abstract: Unstructured medical texts contain a large amount of valuable medical knowledge that can be used for clinical decision support. In this work, we extracted a weighted drug-disease network from high-quality diabetes literature provided by DiaKG (an Annotated Diabetes Dataset for Medical Knowledge Graph), which is, to date, the first diabetes dataset for medical knowledge graph construction worldwide. We took a network embedding approach to obtain a low-dimensional dense representation of each node in the drug-disease network and then predicted drug-disease associations. The results show that a weighted Node2Vec method can infer these associations with good performance. Among all the operations and classifiers tested, the combination of the Average operation and a Multilayer Perceptron classifier performed best, achieving an AUC of 0.91 and an F1 score of 0.82. In addition, using cosine similarity between node embeddings, we generated the top-5 closest drug nodes for the diseases “T1DM” and “T2DM”. These drugs were all found to be relevant to the respective diseases, and some interesting phenomena were observed. This study demonstrates that the generated embeddings can represent the features of nodes in the diabetes knowledge network.
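
The two embedding-based steps named in the abstract, the Average operation on a node pair and the cosine-similarity ranking of drug nodes, can be illustrated with a minimal sketch. This is not the authors' code: the function names `average_operator` and `top_k_similar`, the three-dimensional vectors, and the drug labels are all hypothetical placeholders, and real embeddings would come from a trained weighted Node2Vec model rather than being written by hand.

```python
import numpy as np


def average_operator(u, v):
    """Combine two node embeddings into one edge feature by element-wise mean."""
    return (u + v) / 2.0


def top_k_similar(query, candidates, k=5):
    """Rank candidate embeddings by cosine similarity to the query embedding."""
    names = list(candidates)
    matrix = np.stack([candidates[name] for name in names])
    # Normalise to unit length so the dot product equals cosine similarity.
    query_unit = query / np.linalg.norm(query)
    matrix_unit = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    sims = matrix_unit @ query_unit
    order = np.argsort(-sims)[:k]
    return [(names[i], float(sims[i])) for i in order]


# Toy embeddings (hypothetical values, not taken from the paper).
disease = np.array([1.0, 0.0, 1.0])
drugs = {
    "drug_a": np.array([1.0, 0.1, 0.9]),
    "drug_b": np.array([-1.0, 0.5, 0.0]),
    "drug_c": np.array([0.5, 0.5, 0.5]),
}

# Edge feature that a classifier (e.g. a multilayer perceptron) could take as input.
edge_feature = average_operator(disease, drugs["drug_a"])

# Closest drug nodes to the disease node, most similar first.
ranking = top_k_similar(disease, drugs, k=2)
```

In this sketch the edge feature for a candidate drug-disease pair would be passed to a binary classifier, while the ranking step works directly on node embeddings without any classifier.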