Development of a Knowledge Graph Embeddings Model for Pain

Jaya Chaturvedi, Tao Wang, Sumithra Velupillai, Robert Stewart, Angus Roberts
DOI: https://doi.org/10.48550/arXiv.2308.08904
2023-08-17
Abstract: Pain is a complex concept that can interconnect with other concepts such as a disorder that might cause pain, a medication that might relieve pain, and so on. To fully understand the context of pain experienced by either an individual or across a population, we may need to examine all concepts related to pain and the relationships between them. This is especially useful when modeling pain that has been recorded in electronic health records. Knowledge graphs represent concepts and their relations by an interlinked network, enabling semantic and context-based reasoning in a computationally tractable form. These graphs can, however, be too large for efficient computation. Knowledge graph embeddings help to resolve this by representing the graphs in a low-dimensional vector space. These embeddings can then be used in various downstream tasks such as classification and link prediction. The various relations associated with pain which are required to construct such a knowledge graph can be obtained from external medical knowledge bases such as SNOMED CT, a hierarchical systematic nomenclature of medical terms. A knowledge graph built in this way could be further enriched with real-world examples of pain and its relations extracted from electronic health records. This paper describes the construction of such knowledge graph embedding models of pain concepts, extracted from the unstructured text of mental health electronic health records, combined with external knowledge created from relations described in SNOMED CT, and their evaluation on a subject-object link prediction task. The performance of the models was compared with other baseline models.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper sets out to develop a knowledge graph embeddings model for understanding and analyzing pain-related concepts and the relationships between them, particularly as they are recorded in electronic health records (EHRs). Specifically, the research aims to:

1. **Combine structured knowledge and unstructured text data**: Structured knowledge from an external medical knowledge base (SNOMED CT) is combined with unstructured text extracted from mental health EHRs to construct a more comprehensive knowledge graph embeddings model.
2. **Improve the performance of downstream tasks**: Different variants of the knowledge graph embeddings model are evaluated on a link prediction task and compared with baseline models to verify how effectively they handle pain-related data.
3. **Explore the relationship between pain and mental health**: The constructed model is intended to support further study of the relationship between pain and mental health, especially the impact of pain as a comorbid factor on adverse outcomes.

### Research Background

Pain is a complex concept that connects to many others, such as the disorders that cause it and the medications that relieve it. To fully understand the pain experienced by an individual or a population, all pain-related concepts and the relationships between them need to be examined. This is especially important when modeling pain recorded in electronic health records. Knowledge graphs (KGs) can represent these concepts and their relationships, but large-scale KGs can be too large for efficient computation. Knowledge graph embeddings (KGEs) address this by representing a KG in a low-dimensional vector space, so that it can be used in downstream tasks such as classification and link prediction.
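To make the link prediction task more concrete, the following is a minimal sketch of how such a task is commonly evaluated: for each query with a known subject and relation, the model scores every candidate object, and the rank of the correct object is recorded. The metrics shown (mean reciprocal rank and Hits@k) are conventional in the knowledge graph embedding literature and are an assumption here; this summary does not state which metrics the paper reports. All entity names and scores are made-up toy values, not data from the study.

```python
# Illustrative sketch only: rank-based evaluation for link prediction.
# Entity names and scores below are hypothetical toy values, not results from the paper.


def rank_of_true(candidate_scores, true_entity):
    """Return the 1-based rank of true_entity among candidates sorted by descending score.

    Ties are resolved optimistically: only strictly higher scores push the rank down.
    """
    true_score = candidate_scores[true_entity]
    return 1 + sum(1 for s in candidate_scores.values() if s > true_score)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries; 1.0 means every true entity was ranked first."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of queries whose true entity appears in the top k candidates."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Each query pairs the model's scores over candidate objects with the correct object.
queries = [
    ({"head": 0.9, "thorax": 0.2, "abdomen": 0.1}, "head"),
    ({"head": 0.3, "thorax": 0.5, "abdomen": 0.4}, "abdomen"),
]

ranks = [rank_of_true(scores, truth) for scores, truth in queries]
mrr = mean_reciprocal_rank(ranks)
hits1 = hits_at_k(ranks, 1)
print(ranks, mrr, hits1)
```

In practice the candidate scores would come from a trained embedding model rather than a hand-written dictionary, and the candidate set would cover every entity in the graph.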
### Method Overview

The research constructs three variants of the knowledge graph embeddings model:

- **Variant 1**: Uses only the triples generated from the pain vocabulary.
- **Variant 2**: Combines the triples from the pain vocabulary with the pain concepts found in the CRIS electronic health record data.
- **Variant 3**: Extends Variant 2 with sentence embeddings of the CRIS sentences that contain pain concepts.

Two embedding models are used: ComplEx and TransE. ComplEx is a tensor factorization model that uses complex-valued embeddings, while TransE treats each relation as a translation in the embedding space and scores a triple with a distance function. The experimental results show that Variant 3, which combines EHR data and sentence embeddings, performs best on the link prediction task, especially with the ComplEx model.

### Conclusion

The research shows that a knowledge graph embeddings model combining structured knowledge with real-world text data performs better on pain-related data than a model built from structured knowledge alone. This provides new tools and methods for future research, helps to clarify the relationship between pain and mental health, and may ultimately improve pain management and treatment outcomes for patients.
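The two scoring approaches named in the Method Overview can be sketched as follows. This is a minimal, standard-library illustration of the textbook TransE and ComplEx scoring functions, not the implementation used in the paper; the function names and toy vectors are hypothetical. It also shows one practical difference between the models: ComplEx can assign different scores to a triple and its reverse when the relation embedding has an imaginary part, which lets it represent asymmetric relations.

```python
import math


def transe_score(h, r, t):
    """TransE: negative L2 distance between (h + r) and t; higher means more plausible."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def complex_score(h, r, t):
    """ComplEx: real part of sum_i h_i * r_i * conj(t_i) over complex-valued embeddings."""
    return sum(hi * ri * ti.conjugate() for hi, ri, ti in zip(h, r, t)).real


# Hypothetical real-valued toy vectors where t equals h + r exactly,
# so the TransE distance is zero (the best possible score).
h_real = [1.0, 0.0]
r_real = [0.5, 0.5]
t_real = [1.5, 0.5]
perfect = transe_score(h_real, r_real, t_real)

# Hypothetical complex-valued toy embeddings for ComplEx.
h_c = [1 + 2j, 0.5 - 1j]
t_c = [2 - 1j, 1 + 1j]
r_symmetric = [1 + 0j, 2 + 0j]   # purely real relation: score(h, r, t) == score(t, r, h)
r_asymmetric = [1j, 1 + 1j]      # imaginary part present: the two directions can differ

sym_forward = complex_score(h_c, r_symmetric, t_c)
sym_reverse = complex_score(t_c, r_symmetric, h_c)
asym_forward = complex_score(h_c, r_asymmetric, t_c)
asym_reverse = complex_score(t_c, r_asymmetric, h_c)
print(perfect, sym_forward, sym_reverse, asym_forward, asym_reverse)
```

Directionality matters for relations of the kind the abstract describes, such as a medication that relieves pain, where the reverse statement does not hold. Whether this property explains the ComplEx results reported above is not stated in this summary.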