Entropic Wasserstein Component Analysis

Antoine Collas, Titouan Vayer, Rémi Flamary, Arnaud Breloy
DOI: https://doi.org/10.48550/arXiv.2303.05119
IF: 5.414
2023-03-09
Machine Learning
Abstract: Dimension reduction (DR) methods provide systematic approaches for analyzing high-dimensional data. A key requirement for DR is to incorporate global dependencies among original and embedded samples while preserving clusters in the embedding space. To achieve this, we combine the principles of optimal transport (OT) and principal component analysis (PCA). Our method seeks the best linear subspace that minimizes reconstruction error using entropic OT, which naturally encodes the neighborhood information of the samples. From an algorithmic standpoint, we propose an efficient block-majorization-minimization solver over the Stiefel manifold. Our experimental results demonstrate that our approach can effectively preserve high-dimensional clusters, leading to more interpretable and effective embeddings. Python code of the algorithms and experiments is available online.
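To make the idea of an "entropic OT reconstruction error" concrete, the following is a minimal NumPy sketch under stated assumptions. It is not the authors' block-majorization-minimization solver and does not optimize the subspace: it fixes an orthonormal basis U (taken from plain PCA), computes an entropic OT plan with Sinkhorn iterations between the samples and their projections, and evaluates the transport-weighted reconstruction cost. The function names (`sinkhorn_plan`, `ot_reconstruction_cost`), the uniform marginals, the squared Euclidean cost, and the value of `eps` are all illustrative choices, not details taken from the paper.

```python
import numpy as np

def sinkhorn_plan(C, eps, n_iter=200):
    """Entropic OT plan for cost matrix C with uniform marginals (assumption)."""
    n, m = C.shape
    a = np.full(n, 1.0 / n)          # source weights
    b = np.full(m, 1.0 / m)          # target weights
    K = np.exp(-C / eps)             # Gibbs kernel
    u = np.ones(n)
    for _ in range(n_iter):
        v = b / (K.T @ u)            # scale columns toward marginal b
        u = a / (K @ v)              # scale rows toward marginal a
    return u[:, None] * K * v[None, :]

def ot_reconstruction_cost(X, U, eps=1.0):
    """Transport-weighted squared error between samples and their projections."""
    X_rec = X @ U @ U.T              # projection of each sample onto span(U)
    # C[i, j] = ||x_i - U U^T x_j||^2
    C = ((X[:, None, :] - X_rec[None, :, :]) ** 2).sum(axis=-1)
    P = sinkhorn_plan(C, eps)
    return P, float((P * C).sum())

# Illustrative data: 50 centered samples in 5 dimensions (synthetic).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
X -= X.mean(axis=0)

# Orthonormal basis from PCA: a point on the Stiefel manifold (U^T U = I).
_, _, Vt = np.linalg.svd(X, full_matrices=False)
U = Vt[:2].T

P, cost = ot_reconstruction_cost(X, U)
```

In this sketch the plan `P` couples each sample with the projections of nearby samples rather than only with its own projection, which is one way to read the abstract's remark that entropic OT encodes neighborhood information. A full method would alternate between updating `P` and updating `U` under the orthonormality constraint.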