Abstract: Knowledge graphs, which consist of entities and their relations, have become a popular way to store structured knowledge. Knowledge graph embedding (KGE), which derives a representation for each entity and relation, is widely used to capture the semantics of the information in knowledge graphs and has demonstrated great success in many downstream applications, such as retrieving entities similar to a query entity. However, existing KGE methods do not work well on emerging large-scale knowledge graphs because of constraints on storage and inference efficiency. In this paper, we propose a lightweight KGE model, LightKG, which significantly reduces both the storage and the running time needed for inference. Instead of storing a continuous vector for every entity, LightKG stores only a few codebooks, each containing codewords that act as representatives of the embeddings, together with the indices that record which codewords each entity selects. Hence LightKG achieves highly efficient storage. The downstream querying process is also significantly faster, because the relevance score between the query and an entity can be computed via a quick look-up in a table that contains the scores between the query and the codewords. The storage and inference efficiency of LightKG follows from its design. LightKG is an end-to-end framework that automatically infers codebooks and codewords and generates an approximate embedding for each entity. A residual module is included in LightKG to induce diversity among codebooks, and a continuous function is adopted to approximate codeword selection, which is non-differentiable. In addition, to further improve the performance of KGE, we propose a novel dynamic negative sampling method based on quantization, which can be applied to LightKG or to other KGE methods.
We conduct extensive experiments on five public datasets. The experiments show that LightKG is search- and memory-efficient while retaining high approximate search accuracy. The dynamic negative sampling also dramatically improves model performance, with over 19% improvement on average.
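
The codebook storage and table look-up described in the abstract can be illustrated with a minimal sketch. It is not the LightKG implementation: it assumes that an entity embedding is approximated by the sum of one codeword per codebook and that relevance is an inner product, neither of which the abstract states explicitly. The function names (`build_score_table`, `score_entities`, `reconstruct`) and all sizes are hypothetical, and the codebooks are random rather than learned.

```python
import numpy as np

def build_score_table(query, codebooks):
    """Score the query against every codeword once.

    codebooks: (M, K, d) array, M codebooks of K codewords each.
    query:     (d,) array.
    Returns an (M, K) table of inner products.
    """
    return codebooks @ query

def score_entities(table, codes):
    """Score all entities by table look-up only.

    codes: (N, M) integer array; codes[n, m] is the codeword that
    entity n selects from codebook m (assumed additive composition).
    Returns an (N,) array of scores.
    """
    num_codebooks = table.shape[0]
    # table[m, codes[n, m]] gathered for every n and m, then summed over m.
    return table[np.arange(num_codebooks), codes].sum(axis=1)

def reconstruct(codebooks, codes):
    """Rebuild approximate embeddings, shape (N, d), for comparison."""
    num_codebooks = codebooks.shape[0]
    return codebooks[np.arange(num_codebooks), codes].sum(axis=1)

# Hypothetical sizes; random values stand in for learned codebooks.
rng = np.random.default_rng(0)
M, K, d, N = 4, 16, 8, 1000
codebooks = rng.normal(size=(M, K, d))
codes = rng.integers(0, K, size=(N, M))
query = rng.normal(size=d)

table = build_score_table(query, codebooks)
fast = score_entities(table, codes)            # M look-ups per entity
slow = reconstruct(codebooks, codes) @ query   # full d-dimensional dot products
```

Under these assumptions the two scores agree up to floating-point error, since the inner product distributes over the sum of codewords. Each entity is stored as M indices of log2(K) bits each instead of d floating-point values, and the shared codebooks add M·K·d values in total.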

A Lightweight Knowledge Graph Embedding Framework for Efficient Inference and Storage

Efficient Non-Sampling Knowledge Graph Embedding

Efficiently Embedding Dynamic Knowledge Graphs

Highly Efficient Knowledge Graph Embedding Learning with Orthogonal Procrustes Analysis

Meta-Knowledge Transfer for Inductive Knowledge Graph Embedding

SSKGE: a time-saving knowledge graph embedding framework based on structure enhancement and semantic guidance

SEPAKE: a structure-enhanced and position-aware knowledge embedding framework for knowledge graph completion

DistilE: Distiling Knowledge Graph Embeddings for Faster and Cheaper Reasoning

CogKGE: A Knowledge Graph Embedding Toolkit and Benchmark for Representing Multi-source and Heterogeneous Knowledge

Low-Dimensional Federated Knowledge Graph Embedding via Knowledge Distillation

Federated Knowledge Graph Completion Via Embedding-Contrastive Learning

Expeditious Generation of Knowledge Graph Embeddings

PIE: a Parameter and Inference Efficient Solution for Large Scale Knowledge Graph Embedding Reasoning

Knowledge Graph Embedding: An Overview

RelaGraph: Improving Embedding on Small-Scale Sparse Knowledge Graphs by Neighborhood Relations

SEEK: Segmented Embedding of Knowledge Graphs

DGL-KE: Training Knowledge Graph Embeddings at Scale

KGDM: A Diffusion Model to Capture Multiple Relation Semantics for Knowledge Graph Embedding

KGE-CL: Contrastive Learning of Tensor Decomposition Based Knowledge Graph Embeddings

Fast and Continual Knowledge Graph Embedding via Incremental LoRA

Text-Graph Enhanced Knowledge Graph Representation Learning