Abstract: We present a theoretical foundation regarding the boundedness of the t-SNE algorithm. t-SNE employs gradient descent iteration with Kullback-Leibler (KL) divergence as the objective function, aiming to identify a set of points that closely resemble the original data points in a high-dimensional space, minimizing KL divergence. Investigating t-SNE properties such as perplexity and affinity under a weak convergence assumption on the sampled dataset, we examine the behavior of points generated by t-SNE under continuous gradient flow. Demonstrating that points generated by t-SNE remain bounded, we leverage this insight to establish the existence of a minimizer for KL divergence.
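
For orientation, the objects named in the abstract can be written out in the standard t-SNE formulation. This is a sketch under that assumption: the paper's own notation and normalization may differ, and the symbols $x_i$ (input points), $y_i$ (embedded points), $\sigma_i$ (bandwidths) and $n$ (number of points) are introduced here for illustration only.

$$
\begin{aligned}
p_{j\mid i} &= \frac{\exp\!\left(-\|x_i-x_j\|^2 / 2\sigma_i^2\right)}{\sum_{k\neq i}\exp\!\left(-\|x_i-x_k\|^2 / 2\sigma_i^2\right)}, 
\qquad p_{ij} = \frac{p_{j\mid i}+p_{i\mid j}}{2n},\\
q_{ij} &= \frac{\left(1+\|y_i-y_j\|^2\right)^{-1}}{\sum_{k\neq l}\left(1+\|y_k-y_l\|^2\right)^{-1}},\\
\mathrm{KL}(P\,\|\,Q) &= \sum_{i\neq j} p_{ij}\log\frac{p_{ij}}{q_{ij}},\\
\frac{dy_i}{dt} &= -\frac{\partial\,\mathrm{KL}}{\partial y_i}
= -4\sum_{j\neq i}\left(p_{ij}-q_{ij}\right)\frac{y_i-y_j}{1+\|y_i-y_j\|^2}.
\end{aligned}
$$

The last line is the continuous gradient flow. In this notation, the boundedness claim says that the trajectories $y_i(t)$ stay in a bounded set.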

What problem does this paper attempt to address?

The paper studies how the t-SNE (t-distributed Stochastic Neighbor Embedding) algorithm behaves on point clouds sampled from manifolds, and specifically analyzes the convergence of t-SNE viewed as a gradient flow. The main issues it addresses are summarized below:

1. **Research Background and Motivation**:
   - t-SNE is a widely used nonlinear dimensionality reduction method for data visualization.
   - Despite its strong performance in practice, theoretical support for t-SNE is relatively limited, especially regarding its iterative nature.
2. **Core Issues**:
   - **Divergence during the Iterative Process**: Investigating whether the points produced by t-SNE diverge to infinity when handling high-dimensional data.
   - **Existence of Global Minimum**: Determining whether the Kullback-Leibler (KL) divergence used by t-SNE has a global minimum under specific conditions.
3. **Main Contributions**:
   - **Boundedness of the Embedding**: The paper proves that the embedded points generated by t-SNE are uniformly bounded in 2-dimensional space.
   - **Existence of Global Minimum**: Building on this boundedness result, it further proves that the KL divergence has a global minimum.
4. **Technical Details**:
   - **Discussion of Perplexity Parameter**: The paper discusses in detail the properties of the perplexity parameter in t-SNE, including its range, uniqueness, and stability.
   - **Continuous Gradient Flow**: Viewing t-SNE as a continuous gradient flow, the paper derives its key conclusions by analyzing the gradient flow equations.
   - **Changes in Mutual Distances**: Using the gradient flow equations and the structure of the perplexity parameter, the paper studies how the mutual distances between embedded points change in order to infer overall behavior.
5. **Organizational Structure**:
   - The paper first reviews the basic principles of the t-SNE algorithm.
   - Then, it states the assumptions used for analyzing t-SNE, including assumptions about the support set of the input dataset.
   - Next, it examines the properties of the perplexity parameter.
   - Finally, it presents the main theorems and their proofs in detail.

In summary, this paper aims to fill a gap in the theoretical foundation of the t-SNE algorithm. Through mathematical analysis, it proves key properties of t-SNE, providing more solid theoretical support for its practical application.
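
The perplexity calibration and the gradient flow described above can be sketched numerically. The following is a minimal NumPy sketch, assuming the standard t-SNE definitions (Gaussian conditional affinities with a per-point bandwidth found by bisection, Student-t similarities in a 2-dimensional embedding, and the usual KL gradient). It is not the paper's construction. The function names (`conditional_probs`, `perplexity`, `calibrate_sigma`, `joint_p`, `kl_and_grad`, `euler_flow`), the bisection bounds, and the step size are hypothetical choices made for illustration. The explicit Euler loop is only a discretization of the continuous flow that the paper analyzes.

```python
import numpy as np

def conditional_probs(sq_dists_i, sigma):
    """Gaussian affinities p_{j|i} from squared distances of point i to the others."""
    logits = -sq_dists_i / (2.0 * sigma ** 2)
    logits -= logits.max()                 # shift for numerical stability
    p = np.exp(logits)
    return p / p.sum()

def perplexity(p):
    """Perplexity 2^H(p), with the Shannon entropy H measured in bits."""
    nz = p[p > 0]
    return 2.0 ** (-(nz * np.log2(nz)).sum())

def calibrate_sigma(sq_dists_i, target, lo=1e-3, hi=1e3, iters=100):
    """Bisection on sigma (assumed bracket [lo, hi]); perplexity grows with sigma."""
    for _ in range(iters):
        mid = np.sqrt(lo * hi)             # geometric midpoint of the bracket
        if perplexity(conditional_probs(sq_dists_i, mid)) > target:
            hi = mid
        else:
            lo = mid
    return np.sqrt(lo * hi)

def joint_p(X, target_perplexity):
    """Symmetrized input affinities p_ij = (p_{j|i} + p_{i|j}) / (2n)."""
    n = X.shape[0]
    D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    P = np.zeros((n, n))
    for i in range(n):
        others = np.arange(n) != i
        sigma = calibrate_sigma(D[i, others], target_perplexity)
        P[i, others] = conditional_probs(D[i, others], sigma)
    return (P + P.T) / (2.0 * n)

def kl_and_grad(P, Y):
    """KL(P || Q) and its gradient with respect to the embedded points Y."""
    diff = Y[:, None, :] - Y[None, :, :]   # diff[i, j] = y_i - y_j
    W = 1.0 / (1.0 + (diff ** 2).sum(-1))  # Student-t kernel
    np.fill_diagonal(W, 0.0)
    Q = W / W.sum()
    mask = P > 0                           # skips the diagonal and underflowed entries
    kl = (P[mask] * np.log(P[mask] / Q[mask])).sum()
    grad = 4.0 * (((P - Q) * W)[:, :, None] * diff).sum(axis=1)
    return kl, grad

def euler_flow(P, Y0, step=0.05, n_steps=200):
    """Explicit Euler discretization of dY/dt = -grad KL (illustrative step size)."""
    Y = Y0.copy()
    for _ in range(n_steps):
        _, g = kl_and_grad(P, Y)
        Y -= step * g
    return Y
```

One way to use the sketch is to draw a small random `X`, build `P = joint_p(X, 5.0)`, run `euler_flow` from a random 2-dimensional start, and track both `kl_and_grad(P, Y)[0]` and `np.abs(Y).max()` over time. The first quantity is the objective whose minimizer the paper proves to exist. The second is a crude numerical proxy for the boundedness of the embedded points.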

Convergence analysis of t-SNE as a gradient flow for point cloud on a manifold

Deep Manifold Computing and Visualization Using Elastic Locally Isometric Smoothness

An Analysis of the t-SNE Algorithm for Data Visualization

Convergence Analysis of Gradient Algorithms on Riemannian Manifolds Without Curvature Constraints and Application to Riemannian Mass

t-SNE, Forceful Colorings and Mean Field Limits

Convergence of Laplacian Spectra from Point Clouds

Dimension reduction and the gradient flow of relative entropy

Dataset Denoising Based on Manifold Assumption

Convergence analysis of gauss-type proximal point method for metrically regular mappings

Laplacian-based Cluster-Contractive t-SNE for High-Dimensional Data Visualization

Convergence analysis of the transformed gradient projection algorithms on compact matrix manifolds

A Preprocessing Manifold Learning Strategy Based on t-Distributed Stochastic Neighbor Embedding

Discretized Gradient Flow for Manifold Learning in the Space of Embeddings

T-Sne for Complex Multi-Manifold High-Dimensional Data

A convergence analysis of the perturbed compositional gradient flow: averaging principle and normal deviations

Convergence of Laplacian Spectra from Random Samples

Gradient flows on graphons: existence, convergence, continuity equations

Convergence of the Weighted Nonlocal Laplacian on Random Point Cloud

Convergence and non-convergence in a nonlocal gradient flow

Local Conditions for Global Convergence of Gradient Flows and Proximal Point Sequences in Metric Spaces

A Convergence Rate for Manifold Neural Networks