Fast Gradient Computation for Gromov-Wasserstein Distance

Wei Zhang, Zihao Wang, Jie Fan, Hao Wu, Yong Zhang
2024-04-13
Abstract: The Gromov-Wasserstein distance is a notable extension of optimal transport. In contrast to the classic Wasserstein distance, it solves a quadratic assignment problem that minimizes the pair-wise distance distortion under the transportation of distributions and thus can be applied to distributions in different spaces. These properties make Gromov-Wasserstein widely applicable in many fields, such as computer graphics and machine learning. However, computing the Gromov-Wasserstein distance and transport plan is expensive. The well-known Entropic Gromov-Wasserstein approach has cubic complexity, since matrix multiplications must be repeated when computing the gradient of the Gromov-Wasserstein loss. This is a key bottleneck of the method. Existing methods for accelerating the computation focus on sampling and approximation, which leads to low accuracy or an incomplete transport plan. In this work, we propose a novel method to accelerate accurate gradient computation using dynamic programming techniques, reducing the complexity from cubic to quadratic. In this way, the original computational bottleneck is removed and the new entropic solution can be obtained in total quadratic time, which is almost optimal complexity. Furthermore, the method extends easily to some variants. Extensive experiments validate the efficiency and effectiveness of our method.
Machine Learning
What problem does this paper attempt to address?
The paper addresses the high computational cost of the Gromov-Wasserstein (GW) distance. Specifically:

1. **High computational complexity**: Traditional methods for computing the GW distance have high time complexity. In particular, the gradient computation in the entropy-regularized Entropic GW method costs \(O(N^3)\), which makes it very time-consuming on large-scale datasets.
2. **Limitations of existing acceleration methods**: Existing acceleration methods mainly reduce computation through sampling and approximation, which often results in low accuracy or incomplete transport plans.

To address these issues, the paper proposes a new method, Fast Gradient Computation for Gromov-Wasserstein (FGC-GW), which uses dynamic programming techniques to reduce the complexity of gradient computation from \(O(N^3)\) to \(O(N^2)\). The method improves computational efficiency while maintaining high accuracy and complete transport plans. It can also be extended to some GW variants, such as Fused GW (FGW) and Unbalanced GW (UGW).

The main contributions of the paper include:

- **Low time complexity**: Reduces the complexity of gradient computation from \(O(N^3)\) to \(O(N^2)\).
- **High accuracy**: Maintains the same accuracy and complete transport plans as the original entropy regularization method.
- **Wide applicability**: Can be applied to various GW variants, such as FGW and UGW.

Through extensive experimental validation, the paper demonstrates the efficiency and accuracy of the FGC-GW method in different application scenarios, including 1D and 2D random distributions, time series alignment, and image alignment tasks.
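To make the cubic-to-quadratic reduction concrete, the following is a minimal sketch and not the authors' implementation. It assumes, purely for illustration, that both distance matrices have the structured form \(D_{ij} = |i - j|\) (points on a 1D uniform grid with unit spacing); the summary above does not state which structure the paper relies on. Under that assumption, the bottleneck product \(D_X \Gamma D_Y\) that appears in the entropic GW gradient can be evaluated with a running recurrence over rows instead of dense matrix multiplication. The function names `abs_diff_matmul`, `fast_triple_product`, and `dense_triple_product` are hypothetical.

```python
import numpy as np


def abs_diff_matmul(G):
    """Return D @ G for D[i, j] = |i - j| without forming D.

    Assumption (illustrative only): D comes from a 1D uniform grid with
    unit spacing. Cost is O(N * M) for an N x M input instead of O(N^2 * M).
    """
    n = G.shape[0]
    prefix = np.cumsum(G, axis=0)      # prefix[i] = sum of rows 0..i of G
    total = prefix[-1]                 # column sums of G
    # Row 0 of D @ G: sum_j j * G[j]
    first = (np.arange(n)[:, None] * G).sum(axis=0)
    # Recurrence: (D G)[i + 1] - (D G)[i] = 2 * prefix[i] - total
    steps = 2.0 * prefix[:-1] - total
    return np.vstack([first, first + np.cumsum(steps, axis=0)])


def fast_triple_product(G):
    """Return D_X @ G @ D_Y in O(N * M) time under the |i - j| assumption."""
    left = abs_diff_matmul(G)          # D_X @ G
    # D_Y is symmetric, so left @ D_Y == (D_Y @ left.T).T
    return abs_diff_matmul(left.T).T


def dense_triple_product(G):
    """Reference version using explicit dense matrices (cubic cost)."""
    n, m = G.shape
    DX = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    DY = np.abs(np.subtract.outer(np.arange(m), np.arange(m)))
    return DX @ G @ DY


# Compare both versions on a random coupling-like matrix.
rng = np.random.default_rng(0)
G = rng.random((50, 40))
G /= G.sum()
assert np.allclose(fast_triple_product(G), dense_triple_product(G))
```

The result matches the dense product up to floating-point error, which is consistent with the summary's claim that the speed-up does not come from sampling or approximation. Other distance structures would need a different recurrence.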