Abstract: Analyzing high-dimensional data and finding hidden patterns is a difficult problem that has attracted numerous research efforts. Automated methods can be useful to some extent, but bringing the data analyst into the loop via interactive visual tools can help the discovery process tremendously. An inherent problem in this effort is that humans lack the mental capacity to truly understand spaces exceeding three spatial dimensions. To keep within this limitation, we describe a framework that decomposes a high-dimensional data space into a continuum of generalized 3D subspaces. Analysts can then explore these 3D subspaces individually via the familiar trackball interface, with additional facilities for smoothly transitioning to adjacent subspaces for expanded space comprehension. Since the number of such subspaces suffers from combinatorial explosion, we provide a set of data-driven subspace selection and navigation tools that guide users to interesting subspaces and views. A subspace trail map allows users to manage the explored subspaces, and also helps them navigate within and across any higher-dimensional subspaces identified by clustering. The trackball and the trail map are each embedded in a word cloud of attribute labels, sized according to the relevance of the associated data dimensions in the currently selected subspace. Finally, a view gallery helps users keep their bearings and return to interesting subspaces and views. We demonstrate our system via several use cases in a diverse set of application areas, such as cluster analysis and refinement, information discovery, and supervised training of classifiers.
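
To make the combinatorial explosion and the notion of a generalized 3D subspace concrete, the following is a minimal sketch and not the system's actual implementation. It assumes that a generalized 3D subspace can be represented by three orthonormal basis vectors in the full data space, and that the relevance of an attribute can be approximated by the length of its coordinate axis after projection onto that subspace. All function names are hypothetical, and the relevance measure is an illustrative assumption, not one taken from the abstract.

```python
import math


def count_axis_aligned_3d_subspaces(n_dims: int) -> int:
    """Number of ways to pick 3 of n_dims attributes (axis-aligned case only)."""
    return math.comb(n_dims, 3)


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis_3d(vectors):
    """Orthonormalize three n-dimensional vectors with Gram-Schmidt.

    The result spans a (possibly non-axis-aligned) 3D subspace.
    """
    basis = []
    for v in vectors:
        w = list(v)
        # Remove the components along the basis vectors found so far.
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:
            raise ValueError("input vectors are linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis


def project(point, basis):
    """Coordinates of an n-dimensional point in the 3D subspace."""
    return tuple(dot(point, b) for b in basis)


def axis_relevance(basis):
    """Assumed relevance of each attribute: length of its projected unit axis."""
    n_dims = len(basis[0])
    return [math.sqrt(sum(b[j] ** 2 for b in basis)) for j in range(n_dims)]


if __name__ == "__main__":
    # Even the axis-aligned case grows quickly with the number of attributes.
    for n in (10, 50, 100):
        print(n, count_axis_aligned_3d_subspaces(n))

    # A 3D subspace of a 5-D space that mixes attributes 0 and 3.
    basis = orthonormal_basis_3d([
        [1, 0, 0, 1, 0],
        [0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0],
    ])
    print(project([1, 2, 3, 4, 5], basis))
    print(axis_relevance(basis))
```

In this sketch the relevance values could drive label sizes in a word cloud: an attribute whose axis lies entirely inside the subspace gets relevance 1, and one orthogonal to it gets 0. The count covers only axis-aligned subspaces; the generalized subspaces described in the abstract form a continuum, which is why data-driven guidance is needed.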

The Subspace Voyager: Exploring High-Dimensional Data along a Continuum of Salient 3D Subspaces