Interpretable Embedding and Visualization of Compressed Data

Nikolaos M. Freris, Ahmad Ajalloeian, Michalis Vlachos
DOI: https://doi.org/10.1145/3537901
IF: 4.157
2023-01-01
ACM Transactions on Knowledge Discovery from Data
Abstract: Traditional embedding methodologies, also known as dimensionality reduction techniques, assume the availability of exact pairwise distances between the high-dimensional objects that will be embedded in a lower-dimensional space. In this article, we propose an embedding that overcomes this limitation and can operate on pairwise distances that are represented as a range of lower and upper bounds. Such bounds are typically estimated when objects are compressed in a lossy manner, so our approach is highly applicable in the case of big compressed datasets. Our methodology can preserve multiple aspects of the original data relationships: distances, correlations, and object scores/ranks, whereas existing techniques typically preserve only distances. Comparative experiments with prevalent embedding methodologies (ISOMAP, t-SNE, MDS, UMAP) illustrate that our approach can faithfully preserve multiple object relationships, even in the presence of inexact distance information. Our visualization method is also easily interpretable.