Abstract: In this paper, we explore how to use topological tools to compare dimension reduction methods. We first give a brief overview of some methods often used for dimension reduction, such as Isometric Feature Mapping, Laplacian Eigenmaps, Fast Independent Component Analysis, Kernel Ridge Regression, and t-distributed Stochastic Neighbor Embedding. We then give a brief overview of some notions used in topological data analysis, such as barcodes, persistent homology, and Wasserstein distance. Theoretically, these methods, when applied to a data set, can be interpreted differently. Starting from EEG data embedded into a high-dimensional manifold, we apply these methods and compare them across persistent homologies of dimension 0, 1, and 2, that is, across connected components, tunnels and holes, and shells around voids or cavities. We find that, from three-dimensional point clouds alone, it is not clear how distinct the methods are from each other, but Wasserstein and bottleneck distances, topological tests of hypothesis, and various other methods show that the methods differ qualitatively and significantly across homologies.
What problem does this paper attempt to address?
The problem this paper attempts to address is how to use topological tools to compare different dimensionality reduction methods. Specifically, the authors first review some commonly used dimensionality reduction methods, such as Isometric Feature Mapping, Laplacian Eigenmaps, Fast Independent Component Analysis, Kernel Ridge Regression, and t-distributed Stochastic Neighbor Embedding. Then, the authors introduce some basic concepts in topological data analysis, such as barcodes, persistent homology, and Wasserstein distance.
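To make the notion of a barcode concrete, here is a minimal sketch of the 0-dimensional case only (connected components). It is not the authors' code. It assumes a Vietoris-Rips filtration in which an edge appears at a scale equal to its length, so the finite H0 bars end exactly at the edge lengths of a minimum spanning tree. The function name `h0_barcode` and the toy point cloud are illustrative assumptions, not taken from the paper.

```python
import math
from itertools import combinations


def h0_barcode(points):
    """Return the finite H0 bars (birth, death) of a Vietoris-Rips filtration.

    Every point is born at scale 0. A component dies at the length of the
    edge that merges it into another component (Kruskal's algorithm with
    union-find). The one component that never dies is omitted.
    """
    parent = list(range(len(points)))

    def find(i):
        # Walk up to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, shortest first.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    bars = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge merges two components, so one bar ends here
            parent[ri] = rj
            bars.append((0.0, length))
    return bars


# Toy cloud (an assumption for illustration, not EEG data):
# two tight pairs of points that are far apart from each other.
cloud = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
bars = h0_barcode(cloud)
print(bars)
```

In this toy cloud, the two short bars come from the points inside each pair merging, and the one long bar reflects the gap between the two pairs. Bars for dimensions 1 and 2 require a full persistent homology computation, which a dedicated library would normally handle.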
The authors use electroencephalogram (EEG) data as their test case. They apply these dimensionality reduction methods to EEG data embedded in a high-dimensional manifold and compare the results through persistent homology in dimensions 0, 1, and 2 (i.e., connected components, tunnels or holes, and shells around cavities). The study finds that the differences between the methods are not obvious from the 3-dimensional point clouds alone, but that Wasserstein distance, bottleneck distance, topological hypothesis tests, and other methods show the methods to differ qualitatively and significantly across homology dimensions.
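As an illustration of how two persistence diagrams can be compared, the following sketch computes the bottleneck distance between two small diagrams. It is not the paper's implementation. It assumes diagrams given as lists of finite `(birth, death)` pairs, the L-infinity ground metric, and the usual convention that a point may be matched to the diagonal at cost `(death - birth) / 2`. The function name `bottleneck_distance` is hypothetical, and the brute-force matching is only suitable for small diagrams.

```python
def bottleneck_distance(diag_a, diag_b):
    """Bottleneck distance between two diagrams of (birth, death) pairs."""
    n, m = len(diag_a), len(diag_b)
    size = n + m  # each side is padded with diagonal slots for the other
    if size == 0:
        return 0.0

    def cost(i, j):
        # Rows 0..n-1 are points of A, the remaining rows are diagonal slots.
        # Columns 0..m-1 are points of B, the remaining columns are diagonal slots.
        if i < n and j < m:
            (b1, d1), (b2, d2) = diag_a[i], diag_b[j]
            return max(abs(b1 - b2), abs(d1 - d2))
        if i < n:  # point of A sent to the diagonal
            b, d = diag_a[i]
            return (d - b) / 2.0
        if j < m:  # point of B sent to the diagonal
            b, d = diag_b[j]
            return (d - b) / 2.0
        return 0.0  # diagonal matched with diagonal

    costs = [[cost(i, j) for j in range(size)] for i in range(size)]

    def has_perfect_matching(threshold):
        # Augmenting-path bipartite matching using only edges <= threshold.
        match_of_col = [-1] * size

        def augment(i, seen):
            for j in range(size):
                if costs[i][j] <= threshold and not seen[j]:
                    seen[j] = True
                    if match_of_col[j] == -1 or augment(match_of_col[j], seen):
                        match_of_col[j] = i
                        return True
            return False

        return all(augment(i, [False] * size) for i in range(size))

    # The answer is one of the pairwise costs: binary-search the smallest
    # one for which a perfect matching still exists.
    candidates = sorted({c for row in costs for c in row})
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if has_perfect_matching(candidates[mid]):
            hi = mid
        else:
            lo = mid + 1
    return candidates[lo]


# Two toy H0 diagrams (illustrative values, not results from the paper).
diagram_a = [(0.0, 1.0), (0.0, 1.0), (0.0, 9.0)]
diagram_b = [(0.0, 1.0), (0.0, 1.0), (0.0, 7.0)]
print(bottleneck_distance(diagram_a, diagram_b))
```

The bottleneck distance records only the single worst-matched pair. The Wasserstein distance uses the same kind of matching but sums (powers of) all matching costs, so it is also sensitive to many small differences between diagrams.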
Overall, this paper aims to assess how different dimensionality reduction methods differ when applied to high-dimensional EEG data, using tools from topological data analysis.