JECI++: A Modified Joint Knowledge Graph Embedding Model for Concepts and Instances

Peng Wang, Jing Zhou
DOI: https://doi.org/10.1016/j.bdr.2020.100160
IF: 3.3
2021-05-01
Big Data Research
Abstract: Concepts and instances are important parts of knowledge graphs, but most knowledge graph embedding models treat them equally as entities, which leads to inaccurate embeddings of both. To solve this problem, we propose a novel knowledge graph embedding model called JECI++ that jointly embeds concepts and instances. First, JECI++ simplifies hierarchical concepts based on the subClassOf and instanceOf relations, then re-links instances to the simplified concepts as new instanceOf triples. Consequently, an instance can be obtained from its neighbor instances and the simplified concepts it belongs to. Second, circular convolution is used to locate an instance in the embedding space, based on its neighbor instances and simplified concepts. Finally, simplified concepts and instances are jointly embedded by the embeddings learner with CBOW (Continuous Bag-of-Words) and Skip-Gram strategies. In particular, JECI++ can alleviate the problem of complex relations by incorporating neighbor information of instances. JECI++ is evaluated on link prediction and triple classification using real-world datasets. Experimental results demonstrate that it outperforms state-of-the-art models in most cases.
computer science, information systems, artificial intelligence, theory & methods
What problem does this paper attempt to address?
### The Problem the Paper Attempts to Solve

The paper addresses the inaccurate embeddings that result when a knowledge graph embedding model treats concepts and instances identically. Most existing models do not distinguish between the two and embed both simply as entities.

### Specific Problem Description

1. **Distinction between Concepts and Instances**:
   - Concepts are usually organized as ontologies within knowledge graphs, forming a hierarchy.
   - Instances are specific objects, each corresponding to a unique physical object, and may belong to one or more concepts.
   - Existing embedding models often ignore this distinction, resulting in imprecise embeddings.
2. **Handling Complex Relationships**:
   - Relations in knowledge graphs can be one-to-many, many-to-one, or many-to-many.
   - Existing embedding models have difficulty with these complex relations, which leads to inaccurate embeddings.
3. **Aggregation of Similar Instances**:
   - In the semantic space, instances with similar relations or concepts tend to cluster together, which reduces the distinctiveness of their embeddings.

### Solution

To address these issues, the authors propose a new knowledge graph embedding model, JECI++. It improves the embeddings of concepts and instances through three components:

1. **Hierarchy Tree Generator**:
   - Maps the concepts in the knowledge graph onto a tree and simplifies concepts connected by subClassOf relations.
   - Re-links instances to the simplified concepts, generating new instanceOf triples.
2. **Context Vector Generator**:
   - Generates a context vector from the neighbor information of the target instance.
   - Combines the context vector with the simplified concept vector through circular convolution to predict the best-matching instance.
3. **Embedding Learner**:
   - Jointly learns the embeddings of concepts and instances using CBOW (Continuous Bag-of-Words) and Skip-Gram strategies.
   - Enhances the distinctiveness of the embeddings through negative sampling and a margin loss function.

### Experimental Results

Experimental results show that JECI++ outperforms state-of-the-art models in most cases on link prediction and triple classification, particularly in handling complex relations and in the distinctiveness of instance embeddings.
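
To make the circular convolution step of the Context Vector Generator more concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation: the function names, the use of a plain mean to build the context vector, the dot-product score, and the exact form of the margin loss are all assumptions chosen for illustration. Only the circular convolution operation itself is standard.

```python
import numpy as np


def circular_convolution(a, b):
    """Circular convolution: out[k] = sum_i a[i] * b[(k - i) mod d].

    Computed via the FFT, which gives the same result as the direct sum.
    """
    d = a.shape[-1]
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=d)


def circular_convolution_naive(a, b):
    """Direct O(d^2) definition, kept only to check the FFT version."""
    d = a.shape[0]
    return np.array(
        [sum(a[i] * b[(k - i) % d] for i in range(d)) for k in range(d)]
    )


def context_vector(neighbor_embeddings):
    """ASSUMPTION: average the neighbor-instance embeddings (CBOW-style)."""
    return neighbor_embeddings.mean(axis=0)


def score(context, concept, candidate):
    """ASSUMPTION: score a candidate instance by a dot product with the
    circular convolution of the context and concept vectors."""
    return float(np.dot(circular_convolution(context, concept), candidate))


def margin_loss(pos_score, neg_score, margin=1.0):
    """Hinge-style margin loss; higher scores are assumed to be better."""
    return max(0.0, margin - pos_score + neg_score)


# Hypothetical toy data: three neighbor instances in a 4-dimensional space.
neighbors = np.array([
    [1.0, 0.0, 2.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [2.0, 2.0, 0.0, 2.0],
])
concept = np.array([0.5, 0.0, 1.0, 0.0])
ctx = context_vector(neighbors)
located = circular_convolution(ctx, concept)
```

With a positive (true) instance and a negatively sampled one, `margin_loss(score(ctx, concept, pos), score(ctx, concept, neg))` yields the quantity an embedding learner of this kind would minimize. The FFT form is used because it reduces the cost of the convolution from O(d²) to O(d log d).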