Extraction of Semantic Relations from Medical Literature Based on Semantic Predicates and SVM

Xiaoli Zhao, Shaofu Lin, Zhisheng Huang
DOI: https://doi.org/10.1007/978-3-030-01078-2_2
2018-01-01
Abstract: Relationships between biomedical entities are the cornerstone of acquiring biomedical knowledge. Extracting them is of great significance for constructing related databases in the biomedical field and for managing medical literature. How to quickly and accurately extract the required relationships between biomedical entities from massive unstructured literature is an important research problem. To improve accuracy, we use a support vector machine (SVM), a machine learning algorithm based on feature vectors, to extract relationships between entities. We extract five main relationships in medical literature: ISA, PART_OF, CAUSES, TREATS and DIAGNOSES. First, related topics, such as disease-drug and cause-disease, are used to search the PubMed database for medical literature. The retrieved documents serve as experimental data and are processed to form a corpus. For feature selection, information gain is used to select the most influential features of the entities themselves and of their contexts. On this basis, semantic predicates are added as a feature to improve accuracy. The experimental results show that extraction accuracy increases by 5%-10%. Finally, the Resource Description Framework (RDF) is used to store the relationships extracted from the corresponding documents, which supports subsequent retrieval of related documents.
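The feature-selection step mentioned in the abstract can be illustrated with a minimal sketch of information gain, IG(Y; f) = H(Y) - H(Y | f), computed for binary feature presence. This is not the authors' implementation: the toy samples, the feature names (such as `pred=treats` or `e1=drug`), and the helper functions `entropy` and `information_gain` are hypothetical and chosen only to show how a semantic-predicate feature could be ranked against entity features.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(samples, labels, feature):
    """Information gain of a binary feature (present / absent in a sample).

    samples: list of sets of feature strings (hypothetical encoding).
    labels:  list of relation labels, aligned with samples.
    """
    present = [y for x, y in zip(samples, labels) if feature in x]
    absent = [y for x, y in zip(samples, labels) if feature not in x]
    total = len(labels)
    # Conditional entropy H(Y | f), skipping empty partitions.
    conditional = sum(
        (len(part) / total) * entropy(part)
        for part in (present, absent)
        if part
    )
    return entropy(labels) - conditional


# Toy corpus (assumed for illustration only): each sample is a set of
# entity features plus one semantic-predicate feature.
samples = [
    {"pred=treats", "e1=drug", "e2=disease"},
    {"pred=treats", "e1=drug", "e2=disease"},
    {"pred=causes", "e1=virus", "e2=disease"},
    {"pred=causes", "e1=gene", "e2=disease"},
]
labels = ["TREATS", "TREATS", "CAUSES", "CAUSES"]

# Rank all features by information gain, highest first.
features = sorted(set().union(*samples))
ranked = sorted(
    features,
    key=lambda f: information_gain(samples, labels, f),
    reverse=True,
)

for f in ranked:
    print(f"{f}: {information_gain(samples, labels, f):.4f}")
```

In this toy data, a feature that appears in every sample (`e2=disease`) carries no information about the relation label, while the predicate feature separates the two classes perfectly. In a real pipeline the top-ranked features would then be encoded as vectors and passed to an SVM classifier.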