Reliable Representation of Data on Manifolds

Jun Li, Pengwei Hao
DOI: https://doi.org/10.5244/c.22.113
2008-01-01
Abstract: Manifold learning algorithms are promising data analysis tools. However, to fit an unseen point into a learned model, the point must be located in the training set, which limits the scalability of these algorithms. In this paper, we discuss how to select landmarks from the data to help locate test points. Our method is designed for data on manifolds: the way the landmarks represent the data in the ambient space should resemble the way they represent the data on the manifold. Compared to previous research, (i) our test forgoes the requirement of knowing the intrinsic manifold dimension and is thus more applicable and robust; (ii) our selection implies a provable topology preservation property; and (iii) we also provide a way to improve existing landmarks. Experiments on synthetic and real data support the proposed properties and algorithms.