Unsupervised Discovery of Subspace Trends

Yan Xu, Peng Qiu, Badrinath Roysam
DOI: https://doi.org/10.1109/tpami.2015.2394475
IF: 23.6
2015-01-01
IEEE Transactions on Pattern Analysis and Machine Intelligence
Abstract: This paper presents unsupervised algorithms for discovering previously unknown subspace trends in high-dimensional data sets without the benefit of prior information. A subspace trend is a sustained pattern of gradual or progressive changes within an unknown subset of feature dimensions. A fundamental challenge to subspace trend discovery is the presence of irrelevant data dimensions, noise, outliers, and confusion from multiple subspace trends that are driven by independent factors and mixed in with each other. These factors can obscure the trends in conventional data visualizations based on dimension reduction and projection. To overcome these limitations, we propose a novel graph-theoretic neighborhood similarity measure for detecting concordant progressive changes across data dimensions. Using this measure, we present an unsupervised algorithm for trend-relevant feature selection, subspace trend discovery, quantification of trend strength, and validation. Our method successfully identified verifiable subspace trends in diverse synthetic and real-world biomedical datasets. Visualizations derived from the selected trend-relevant features revealed biologically meaningful hidden subspace trend(s) that were obscured by irrelevant features and noise. Although our examples are drawn from the biological domain, the proposed algorithm is broadly applicable to exploratory analysis of high-dimensional data, including visualization, hypothesis generation, knowledge discovery, and prediction in diverse other applications.
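The abstract does not define the neighborhood similarity measure, so the following is only a loose illustration of the general idea and not the authors' algorithm. It assumes that two features taking part in the same trend should place each sample among similar neighbors, and scores a feature pair by the average Jaccard overlap of per-feature k-nearest-neighbor sets. The function names `knn_sets` and `neighborhood_similarity`, the Jaccard overlap, and the parameter `k` are all assumptions made for this sketch.

```python
import numpy as np

def knn_sets(values, k):
    """For each sample, return the set of indices of its k nearest
    samples measured along a single feature (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    dist = np.abs(values[:, None] - values[None, :])
    np.fill_diagonal(dist, np.inf)          # a sample is not its own neighbor
    idx = np.argsort(dist, axis=1)[:, :k]   # k closest samples per row
    return [set(row.tolist()) for row in idx]

def neighborhood_similarity(x, y, k=10):
    """Mean Jaccard overlap between the k-NN sets induced by feature x
    and those induced by feature y (an assumed stand-in measure)."""
    a, b = knn_sets(x, k), knn_sets(y, k)
    return float(np.mean([len(s & t) / len(s | t) for s, t in zip(a, b)]))

# Synthetic example: x and y follow one hidden trend t, z is pure noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
x = t + 0.002 * rng.normal(size=t.size)
y = 2.0 * t + 0.004 * rng.normal(size=t.size)
z = rng.normal(size=t.size)

sim_trend = neighborhood_similarity(x, y)
sim_noise = neighborhood_similarity(x, z)
```

In this toy setting the trend-related pair should score well above the pair that includes the noise feature, which is the kind of contrast a trend-relevant feature selection step could exploit.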