Dimensionality Reduction and Data Visualisation

Hujun Yin
2008-01-01
Abstract: Dimension reduction has long been associated with retinotopic mapping for understanding cortical maps and neural information processing. Multisensory information is perceived, propagated and mapped onto the 2-D cortex in a near-optimal, information-preserving manner. Data visualisation, inspired by this mechanism, is playing an increasingly important role in many applications involving feature and data reduction, from biology, neuroscience, decision support and social science to management science. The topic has also attracted a great deal of attention in computer vision and pattern recognition. Classic linear methods include principal component analysis (PCA), factor analysis, projection pursuit and independent component analysis. Recently there have been considerable efforts and advances in developing methodologies and techniques for nonlinear dimensionality reduction. A number of novel projection methods have been proposed from statistics, geometry theory and neural networks. Two fundamental approaches are multidimensional scaling and nonlinear PCA. This tutorial will provide an introduction to this challenging and demanding topic. Various methods along these lines, such as self-organising maps, kernel PCA, principal manifolds, metric and non-metric scaling, Isomap, locally linear embedding, Laplacian eigenmaps, as well as spectral clustering, will be explained and discussed. It will also attempt to unify these methods under a constrained self-organising framework. Examples and applications will be shown to illustrate the usefulness and strengths of the various methods, as well as their weaknesses and limitations.