Scalable Graph Representation Learning Via Locality-Sensitive Hashing

Xiusi Chen, Jyun-Yu Jiang, Wei Wang
DOI: https://doi.org/10.1145/3511808.3557689
2022-01-01
Abstract: A large body of research on graph representation learning aims to learn dense features as graph embeddings for information networks, capturing the semantics of complex networks and benefiting a variety of downstream tasks. Most existing studies focus on structural properties, such as distances and neighborhood proximity between nodes. However, real-world information networks are dominated by low-degree nodes, because they are not only sparse but also follow a power-law distribution. Because of this sparsity, proximity-based methods cannot derive satisfactory representations for these tail nodes. To address this challenge, we propose a novel approach, Content-Preserving Locality-Sensitive Hashing (CP-LSH), which incorporates content information into representation learning. Specifically, we aim to preserve LSH-based content similarity between nodes so that knowledge from popular nodes can be transferred to long-tail nodes. We also propose a novel hashing trick that reduces redundant space consumption, so that CP-LSH can handle industry-scale data. We conduct extensive offline experiments on three large-scale public datasets and also deploy CP-LSH in real-world recommendation systems on one of the largest e-commerce platforms for online experiments. Experimental results demonstrate that CP-LSH outperforms competitive baseline methods in node classification and link prediction tasks. The results of online experiments also indicate that CP-LSH is practical and robust for real-world production systems.
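To make the idea of LSH-based content similarity concrete, the following is a minimal sketch of generic random-hyperplane LSH (SimHash). It is not the CP-LSH method, whose details the abstract does not give. The vector dimension, the number of hash bits, the synthetic "content" vectors, and the helper names `simhash_signature` and `hamming_similarity` are all assumptions made for illustration.

```python
import numpy as np


def simhash_signature(x, planes):
    """Return one bit per random hyperplane: 1 if x lies on its non-negative side."""
    return (planes @ x >= 0).astype(np.uint8)


def hamming_similarity(a, b):
    """Fraction of signature bits on which two signatures agree."""
    return float(np.mean(a == b))


# Hypothetical setup: 16-dimensional content vectors hashed to 256 bits.
rng = np.random.default_rng(0)
dim, n_bits = 16, 256
planes = rng.standard_normal((n_bits, dim))

# Synthetic content features (assumed): a popular node, a tail node with
# similar content, and an unrelated node.
popular = rng.standard_normal(dim)
tail_similar = popular + 0.1 * rng.standard_normal(dim)
unrelated = rng.standard_normal(dim)

sig_popular = simhash_signature(popular, planes)
sig_tail = simhash_signature(tail_similar, planes)
sig_unrelated = simhash_signature(unrelated, planes)

print("popular vs. similar tail:", hamming_similarity(sig_popular, sig_tail))
print("popular vs. unrelated:   ", hamming_similarity(sig_popular, sig_unrelated))
```

Under this scheme, the probability that two vectors agree on a single bit is 1 - θ/π, where θ is the angle between them, so nodes with similar content share most signature bits regardless of how many edges they have. This is the general property that lets content similarity link a sparsely connected tail node to a popular one.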