A Geometry-Aware Algorithm to Learn Hierarchical Embeddings in Hyperbolic Space

Zhangyu Wang, Lantian Xu, Zhifeng Kong, Weilong Wang, Xuyu Peng, Enyang Zheng
2024-07-24
Abstract: Hyperbolic embeddings are a class of representation learning methods that offer competitive performances when data can be abstracted as a tree-like graph. However, in practice, learning hyperbolic embeddings of hierarchical data is difficult due to the different geometry between hyperbolic space and the Euclidean space. To address such difficulties, we first categorize three kinds of illness that harm the performance of the embeddings. Then, we develop a geometry-aware algorithm using a dilation operation and a transitive closure regularization to tackle these illnesses. We empirically validate these techniques and present a theoretical analysis of the mechanism behind the dilation operation. Experiments on synthetic and real-world datasets reveal superior performances of our algorithm.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper primarily addresses the challenges encountered when learning hierarchical embeddings in hyperbolic space, particularly the difficulties arising from the geometric differences between hyperbolic space and Euclidean space. Specifically, the authors first define three types of "illnesses" that affect the quality of embeddings:

1. **Capacity illness**: Occurs when node B is the parent of node B'.
2. **Intra-subtree illness**: Occurs when node B is an ancestor of node B' but not the direct parent.
3. **Inter-subtree illness**: Occurs when the nearest common ancestor C of B and B' is not equal to B.

To address these illnesses, the authors propose a geometry-aware algorithm that includes the following two key components:

1. **Dilation Operation**: By adjusting the scale of the embedding structure, each point is moved to a position with sufficient local capacity, increasing the distance between nodes and their nearest neighbors, thereby improving local capacity.
2. **Transitive Closure Regularization**: This includes adding transitive closure edges and a re-weighting strategy. Adding transitive closure edges aims to separate subtrees and reduce their overlap in hyperbolic space, while the re-weighting strategy aims to avoid overfitting by adjusting the weights of transitive closure edges in the early stages of training.

Experimental results show that the algorithm outperforms baseline methods on both synthetic and real-world datasets, particularly excelling in handling extremely dense datasets.

In summary, the main contribution of this paper is the proposal of an effective solution to the problem of learning hierarchical embeddings in hyperbolic space, and the demonstration of its effectiveness through theoretical analysis and experiments.
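The two components described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the Poincaré ball model of hyperbolic space and treats dilation as a radial rescaling of each point's hyperbolic distance from the origin, which may differ from the operation defined in the paper. The function names `transitive_closure_edges`, `poincare_distance`, and `dilate` are hypothetical, and the re-weighting strategy is not modeled.

```python
import math


def transitive_closure_edges(parent):
    """Return every (ancestor, descendant) pair of a tree.

    `parent` maps each node to its parent; the root maps to None.
    """
    edges = set()
    for node in parent:
        anc = parent[node]
        while anc is not None:
            edges.add((anc, node))
            anc = parent.get(anc)  # walk one level up; None ends the walk
    return edges


def poincare_distance(u, v):
    """Geodesic distance between two points inside the open unit ball."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    norm_u = sum(a * a for a in u)
    norm_v = sum(b * b for b in v)
    return math.acosh(1 + 2 * sq_diff / ((1 - norm_u) * (1 - norm_v)))


def dilate(x, factor):
    """Multiply a point's hyperbolic distance from the origin by `factor`.

    Assumption for illustration: dilation is a radial rescale about the
    origin. The point stays inside the unit ball because tanh(.) < 1.
    """
    norm = math.sqrt(sum(a * a for a in x))
    if norm == 0:
        return list(x)
    new_norm = math.tanh(factor * math.atanh(norm))
    return [a * new_norm / norm for a in x]
```

For a chain `root -> a -> b`, `transitive_closure_edges` returns the two direct edges plus the extra pair `("root", "b")`. Applying `dilate` with a factor greater than 1 to two points on different rays from the origin increases their `poincare_distance`, which is the sense in which rescaling gives each node more room around it.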