Abstract: Interactive exploration of multidimensional data sets is challenging because (1) it is difficult to comprehend patterns in more than three dimensions, and (2) current systems are often a patchwork of graphical and statistical methods, leaving many researchers uncertain about how to explore their data in an orderly manner. This dissertation offers a set of principles and a novel rank-by-feature framework that could enable users to better understand multidimensional and multivariate data by systematically studying distributions in one dimension (1D) or two dimensions (2D), and then discovering relationships, clusters, gaps, outliers, and other features. Users of this rank-by-feature framework can view graphical presentations (histograms, boxplots, and scatterplots) and then choose a feature detection criterion to rank 1D or 2D axis-parallel projections. By combining information visualization techniques (overview, coordination, and dynamic query) with summaries and statistical methods, users can systematically examine the most important 1D and 2D axis-parallel projections. This research provides the following contributions:
(a) Graphics, Ranking, and Interaction for Discovery (GRID) principles: a set of principles for exploratory analysis of multidimensional data, summarized as (1) study 1D, study 2D, then find features, and (2) ranking guides insight, statistics confirm. GRID principles help users organize their discovery process in an orderly manner so as to produce more thorough analyses and extract deeper insights in any multidimensional data application.
(b) Rank-by-feature framework: a user interface framework based on the GRID principles. Interactive information visualization techniques are combined with statistical methods and data mining algorithms to enable users to examine multidimensional data sets in an orderly manner using 1D and 2D projections.
(c) The design and implementation of the Hierarchical Clustering Explorer (HCE), an information visualization tool available at www.cs.umd.edu/hcil/hce. HCE implements the rank-by-feature framework and supports interactive exploration of hierarchical clustering results to reveal one important feature: clusters.
(d) Validation through case studies and user surveys: case studies with motivated experts in three research fields and an email survey of a wide range of HCE users demonstrated the efficacy of HCE and the rank-by-feature framework. These studies also revealed opportunities to improve the design and implementation.
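To make the ranking step concrete, here is a minimal sketch of how 2D axis-parallel projections might be ordered by a feature detection criterion. It is not HCE's implementation: the function names (`pearson`, `rank_2d_projections`), the choice of absolute Pearson correlation as the criterion, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant column counts as uncorrelated
    return cov / sqrt(var_x * var_y)


def rank_2d_projections(columns, criterion=pearson):
    """Score every pair of columns; return (score, name_a, name_b), best first.

    `columns` maps a dimension name to its list of values. The score is the
    absolute value of the chosen criterion, so strong negative relationships
    rank as high as strong positive ones.
    """
    scored = [
        (abs(criterion(columns[a], columns[b])), a, b)
        for a, b in combinations(sorted(columns), 2)
    ]
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Hypothetical toy data set with four dimensions.
data = {
    "height": [150, 160, 170, 180, 190],
    "weight": [50, 60, 70, 80, 90],
    "shoe": [36, 40, 39, 44, 43],
    "noise": [3, 1, 4, 1, 5],
}

ranking = rank_2d_projections(data)
for score, a, b in ranking:
    print(f"{a} vs {b}: {score:.3f}")
```

Any other scoring function with the same two-argument signature could be passed as `criterion`, which mirrors the idea of letting the user pick the feature to rank by; an interactive tool would then display the top-ranked scatterplots for visual inspection.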

Measuring Data Abstraction Quality in Multiresolution Visualization

Assessing Data Quality Within Available Context

Using Visualization to Improve Clustering Analysis on Heterogeneous Information Network

Dynamic Mode Decomposition Analysis of Spatially Agglomerated Flow Databases

Cluster-Based Visual Abstraction for Multivariate Scatterplots

Analysis of Statistics Data Based on Mixed Visualization Techniques

Surface Carving-Based Automatic Volume Data Reduction

DICON: Interactive Visual Analysis of Multidimensional Clusters

Visual Analytics of the Spatio-temporal Multidimensional Air Monitoring Data

Adaptive Contextualization Methods for Combating Selection Bias During High-Dimensional Visualization

An Aggregation-Based Overall Quality Measurement for Visualization

Comparison of Data Visualization, Outlier Detection and Data Dimensionality Reduction Methods

Video abstraction based on the visual attention model and online clustering

Evaluating Multi-Dimensional Visualizations for Understanding Fuzzy Clusters

A spectral method for assessing and combining multiple data visualizations

A three-dimensional display for big data sets

Quantitative effectiveness measures for direct volume rendered images

Guidelines For Pursuing and Revealing Data Abstractions

Measuring Visual Complexity of Cluster-Based Visualizations

Data Quality Measures and Efficient Evaluation Algorithms for Large-Scale High-Dimensional Data

Information Visualization Design for Multidimensional Data: Integrating the Rank-by-Feature Framework with Hierarchical Clustering