Manifold learning: what, how, and why

Marina Meilă, Hanyu Zhang
2023-11-07
Abstract: Manifold learning (ML), known also as non-linear dimension reduction, is a set of methods to find the low dimensional structure of data. Dimension reduction for large, high dimensional data is not merely a way to reduce the data; the new representations and descriptors obtained by ML reveal the geometric shape of high dimensional point clouds, and allow one to visualize, de-noise and interpret them. This survey presents the principles underlying ML, the representative methods, as well as their statistical foundations from a practicing statistician's perspective. It describes the trade-offs, and what theory tells us about the parameter and algorithmic choices we make in order to obtain reliable conclusions.
Machine Learning
What problem does this paper attempt to address?
This paper addresses dimension reduction for high-dimensional data, specifically how non-linear dimension reduction techniques can uncover the low-dimensional structure of such data. It surveys the basic principles, representative methods, and statistical foundations of manifold learning (ML), a collection of methods for finding low-dimensional structure in high-dimensional data. These methods do more than reduce the dimension of the data: they reveal the geometric shape of high-dimensional point clouds, so that the data can be visualized, de-noised, and interpreted. The paper also discusses the trade-offs involved in manifold learning and what theory says about the parameter and algorithm choices needed to obtain reliable conclusions. Finally, it covers applications of manifold learning, especially in statistics and scientific research.
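To make "finding low-dimensional structure" concrete, here is a minimal sketch of an Isomap-style embedding in NumPy. Isomap is one well-known manifold learning method, chosen here only for illustration; the summary above does not name specific algorithms. The function name `isomap_1d`, the spiral toy data, and the choice of `n_neighbors=5` are all assumptions made for this example. The sketch takes points on a one-dimensional curve embedded in the plane and tries to recover a single coordinate that tracks position along the curve.

```python
import numpy as np

def isomap_1d(X, n_neighbors=5):
    """Isomap-style embedding of the rows of X into one dimension (illustrative sketch)."""
    n = X.shape[0]
    # Pairwise Euclidean distances in the ambient space.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    # k-nearest-neighbor graph: keep each point's k shortest edges, symmetrized.
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    nbrs = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]  # column 0 is the point itself
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = nbrs.ravel()
    G[rows, cols] = D[rows, cols]
    G[cols, rows] = D[rows, cols]

    # Floyd-Warshall: shortest paths in the graph approximate geodesic distances.
    for k in range(n):
        G = np.minimum(G, G[:, [k]] + G[[k], :])

    # Classical multidimensional scaling on the geodesic distances.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    top = np.argmax(eigvals)
    return eigvecs[:, top] * np.sqrt(eigvals[top])

# Toy data (an assumption for this example): a spiral, i.e. a 1-D curve in 2-D.
t = np.linspace(0.0, 3.0 * np.pi, 150)
X = np.column_stack([t * np.cos(t), t * np.sin(t)])

# Reference coordinate: cumulative arc length along the sampled curve.
steps = np.linalg.norm(np.diff(X, axis=0), axis=1)
arc = np.concatenate([[0.0], np.cumsum(steps)])

emb = isomap_1d(X)
corr = abs(np.corrcoef(emb, arc)[0, 1])
print(f"|correlation| between embedding and arc length: {corr:.3f}")
```

The neighborhood size is the kind of parameter choice the paper refers to. If it is too small the graph may become disconnected, and if it is too large the graph may connect different arms of the spiral and distort the recovered coordinate.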