Stable Similarity Comparison of Persistent Homology Groups

Jiaxing He, Bingzhe Hou, Tieru Wu, Yang Cao
2024-11-15
Abstract: Classification in the sense of similarity is an important issue. In this paper, we study similarity classification in Topological Data Analysis. We define a pseudometric $d_{S}^{(p)}$ to measure the distance between barcodes generated by persistent homology groups of topological spaces, and we prove that our pseudometric $d_{S}^{(2)}$ is a similarity invariant. Thereby, we establish a connection between Operator Theory and Topological Data Analysis. We give the calculation formula of the pseudometric $d_{S}^{(2)}$ $(d_{S}^{(1)})$ by arranging all eigenvalues of matrices determined by barcodes in descending order to get the infimum over all matchings. Since conformal linear transformations are one representative type of similarity transformation, we construct comparative experiments on both synthetic datasets and waves from an online platform to demonstrate that our pseudometric $d_{S}^{(2)}$ $(d_{S}^{(1)})$ is stable under conformal linear transformations, whereas the bottleneck and Wasserstein distances are not. In particular, our pseudometric on waves depends only on the waveform and is independent of the frequency and amplitude. Furthermore, the computation time for $d_{S}^{(2)}$ $(d_{S}^{(1)})$ is significantly less than the computation time for the bottleneck distance and is comparable to the computation time for the accelerated Wasserstein distance between barcodes.
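The abstract's remark that sorting in descending order attains the infimum over all matchings rests on a standard fact about real sequences: for a cost of the form $|x - y|^p$ with $p \ge 1$, pairing the largest value with the largest, the second largest with the second largest, and so on, is an optimal matching. The sketch below checks this by brute force on small, made-up lists. It does not reproduce the paper's matrices or its definition of $d_{S}^{(p)}$; the function names and the sample values are hypothetical and serve only as an illustration.

```python
from itertools import permutations


def matching_cost(a, b, p=2):
    """Sum of |a[i] - b[i]|**p for the matching that pairs a[i] with b[i]."""
    return sum(abs(x - y) ** p for x, y in zip(a, b))


def sorted_matching_cost(a, b, p=2):
    """Cost of the matching that pairs values after sorting both in descending order."""
    return matching_cost(sorted(a, reverse=True), sorted(b, reverse=True), p)


def brute_force_cost(a, b, p=2):
    """Minimum cost over all bijections between a and b, by exhaustive search."""
    return min(matching_cost(a, perm, p) for perm in permutations(b))


# Hypothetical values standing in for two equal-size lists of eigenvalues.
a = [0.9, 0.1, 0.5, 0.3]
b = [0.2, 0.8, 0.4, 0.6]

for p in (1, 2):
    fast = sorted_matching_cost(a, b, p)
    slow = brute_force_cost(a, b, p)
    # The two agree up to floating-point rounding.
    assert abs(fast - slow) < 1e-12
```

The exhaustive search visits all $n!$ matchings, while the sorted version costs only a sort, which is in keeping with the short computation times the abstract reports.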
Algebraic Topology
What problem does this paper attempt to address?
The problem this paper addresses is how to classify objects up to similarity in Topological Data Analysis (TDA). Specifically, the authors introduce a new pseudometric \(d^{(p)}_S\) to measure the distance between barcodes generated by the persistent homology groups of topological spaces. The new pseudometric is invariant under similarity transformations, so it places similar objects in the same class, whereas existing distances such as the bottleneck distance and the Wasserstein distance do not.

### Main contributions of the paper

1. **Introduction of the similarity pseudometric \(d^{(p)}_S\)**:
   - The authors define a pseudometric \(d^{(p)}_S\) to compare barcodes generated by the persistent homology groups of topological spaces.
   - In particular, they prove that \(d^{(2)}_S\) is a similarity invariant.
2. **Provision of calculation formulas**:
   - Calculation formulas are given for \(d^{(2)}_S\) (and \(d^{(1)}_S\)): the infimum over all matchings is obtained by arranging the eigenvalues of the matrices determined by the barcodes in descending order.
3. **Experimental verification**:
   - Experiments on synthetic datasets verify that the pseudometric is stable under conformal linear transformations.
   - Experiments on piano and tuning fork waveforms show that the pseudometric depends only on the waveform itself and not on frequency or amplitude.
4. **Computational efficiency**:
   - The computation time of \(d^{(2)}_S\) (and \(d^{(1)}_S\)) is significantly less than that of the bottleneck distance and is comparable to that of the accelerated Wasserstein distance.

### Paper background

- **Persistent homology**: Persistent homology is an important tool in TDA, used to analyze topological features at different scales.
- **Barcodes and persistence modules**: Barcodes and persistence modules are two representations of persistent homology groups; they describe the birth and death times of topological features.
- **Existing pseudometrics**: Existing pseudometrics such as the bottleneck distance and the Wasserstein distance measure the distance between barcodes or persistence modules in the sense of isomorphism, but they cannot handle the classification of objects up to similarity.

### Experimental results

- **Synthetic datasets**: The new pseudometrics \(d^{(2)}_S\) and \(d^{(1)}_S\) remain stable under conformal linear transformations, while the bottleneck distance and the Wasserstein distance do not.
- **Waveform data**: For piano and tuning fork waveforms, the new pseudometric distinguishes the two instruments; it depends only on the waveform itself and is not affected by frequency or amplitude.

### Conclusion

This paper proposes a new pseudometric \(d^{(p)}_S\) that classifies objects up to similarity transformation. Beyond its theoretical link to operator theory, the experiments show that it is stable under conformal linear transformations and faster to compute than the bottleneck distance.
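To see why distances that compare bar endpoints directly are sensitive to conformal linear transformations, it helps to look at a toy case. The sketch below computes the finite death times of the degree-0 Vietoris-Rips barcode of a small planar point cloud, which coincide with the edge lengths of a minimum spanning tree, and then applies a rotation followed by a uniform scaling. It assumes the convention that an edge enters the filtration at its length; the point cloud, the function names, and the scale and angle are all hypothetical, and this is not the paper's experiment or its pseudometric.

```python
import math
from itertools import combinations


def h0_death_times(points):
    """Finite death times of the H0 Vietoris-Rips barcode, in descending order.

    Every H0 bar is born at 0. The finite deaths are the edge lengths of a
    minimum spanning tree, found here with Kruskal's algorithm.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge merges two components, so one bar dies
            parent[ri] = rj
            deaths.append(length)
    return sorted(deaths, reverse=True)


def conformal(points, scale, angle):
    """Rotate planar points by `angle` (radians), then scale by `scale`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(scale * (c * x - s * y), scale * (s * x + c * y)) for x, y in points]


# Hypothetical point cloud and transformation parameters.
cloud = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (4.0, 2.0)]
deaths = h0_death_times(cloud)
scaled = h0_death_times(conformal(cloud, scale=3.0, angle=0.7))

# The rotation leaves the barcode unchanged; the scaling multiplies each bar by 3.
assert all(abs(s - 3.0 * d) < 1e-9 for s, d in zip(scaled, deaths))
```

Because every bar is stretched by the scale factor, the transformed barcode is a different barcode, so a distance computed from raw endpoints generally reports a nonzero gap between an object and its rescaled copy. A similarity-invariant comparison has to factor that scale out.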