Construction of Therapy-Disease Knowledge Graph (TDKG) Based on Entity Relationship Extraction

Haohua Wang, Aiyu Wang, Fangfang Su, Honghai Feng, Yanyan Chen
DOI: https://doi.org/10.1109/aemcse51986.2021.00173
2021-03-01
Abstract: A knowledge graph of therapy-disease relationships helps readers understand, query, and learn the relations between therapies and diseases at a macro level, and it exposes the differences among therapies for the same disease by allowing them to be compared. Commonalities among therapies, or among diseases, may also help identify a previously undiscovered therapy for a disease. The therapy-disease knowledge graph has become a focus of medical informatics, and international and domestic Internet giants have deployed products in this area. The basic method of knowledge graph research is relation extraction, a natural language processing task. This paper uses semantic technology to discover several sentence patterns that express therapy-disease relationships, and applies an idea similar to BERT: using the known part of a sentence to learn the unknown part. On this basis, an unsupervised learning algorithm is designed to extract the relationships between therapy and disease entities. Applied to the abstracts of 300,844 publications on disease treatment, the algorithm identified 203,238 entity relationships, of which 180,675 were valid. The results show that the sentence patterns expressing entity relationships in the disease-treatment literature are relatively fixed. Extraction is performed by dividing each sentence into subject, predicate, and object. The extraction accuracy is 88.98% and the recall rate is 60.05%.
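The abstract does not list the sentence patterns or the algorithm itself, so the following is only a minimal sketch of what pattern-based subject/predicate/object extraction could look like. The three patterns, the relation label `treats`, and the function names `extract_relation` and `build_graph` are all assumptions made for illustration, not the authors' method.

```python
import re
from collections import defaultdict

# Hypothetical sentence patterns (assumed for illustration; the paper's
# actual patterns are not given in the abstract).
PATTERNS = [
    # "<therapy> is used to treat <disease>": therapy as subject, disease as object
    re.compile(r"^(?P<therapy>.+?) (?:is|are|was|were) used to treat (?P<disease>.+?)\.?$", re.I),
    # "<disease> was treated with <therapy>": disease as subject, therapy as object
    re.compile(r"^(?P<disease>.+?) (?:is|are|was|were) treated (?:with|by) (?P<therapy>.+?)\.?$", re.I),
    # "<therapy> for the treatment of <disease>": nominal pattern
    re.compile(r"^(?P<therapy>.+?) (?:in|for) the treatment of (?P<disease>.+?)\.?$", re.I),
]

def extract_relation(sentence):
    """Return a (therapy, 'treats', disease) triple, or None if no pattern matches."""
    text = sentence.strip()
    for pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return (match.group("therapy").strip(), "treats", match.group("disease").strip())
    return None

def build_graph(sentences):
    """Group extracted therapies by disease so therapies for one disease can be compared."""
    graph = defaultdict(set)
    for sentence in sentences:
        triple = extract_relation(sentence)
        if triple is not None:
            therapy, _, disease = triple
            graph[disease.lower()].add(therapy.lower())
    return graph

sentences = [
    "Metformin is used to treat type 2 diabetes.",
    "Type 2 diabetes was treated with insulin.",
    "Radiotherapy for the treatment of glioma",
    "The weather was fine.",
]
graph = build_graph(sentences)
```

Indexing the graph by disease reflects the use case the abstract mentions, comparing different therapies for the same disease. A real system would also need entity normalization and far more robust patterns than these.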