Abstract: Data sets of multivariate normal distributions abound in many scientific areas such as diffusion tensor imaging, structure tensor computer vision, radar signal processing, and machine learning, to name just a few. In order to process those normal data sets for downstream tasks like filtering, classification, or clustering, one needs to define proper notions of dissimilarities between normals and paths joining them. The Fisher-Rao distance, defined as the Riemannian geodesic distance induced by the Fisher information metric, is such a principled metric distance; however, it is not known in closed form except in a few particular cases. In this work, we first report a fast and robust method to approximate the Fisher-Rao distance between multivariate normal distributions arbitrarily finely. Second, we introduce a class of distances based on diffeomorphic embeddings of the normal manifold into a submanifold of the higher-dimensional symmetric positive-definite cone corresponding to the manifold of centered normal distributions. We show that the projective Hilbert distance on the cone yields a metric on the embedded normal submanifold, and we pull back that cone distance with its associated straight-line Hilbert cone geodesics to obtain a distance and smooth paths between normal distributions. Compared to the Fisher-Rao distance approximation, the pullback Hilbert cone distance is computationally light since it requires computing only the extreme (minimal and maximal) eigenvalues of matrices. Finally, we show how to use those distances in clustering tasks.

What problem does this paper attempt to address?

This paper addresses the problem of measuring distances between multivariate normal distributions. When data sets of normals are processed for downstream tasks such as filtering, classification, or clustering, one needs appropriate dissimilarity measures between normals and smooth paths joining them. The paper focuses on two main problems:

1. **Computing the Fisher-Rao distance**
   - The Fisher-Rao distance is the Riemannian geodesic distance induced by the Fisher information metric. For multivariate normal distributions it has no known closed form except in a few special cases (such as the one-dimensional case, or when the two normals share the same mean or the same covariance). The paper therefore first proposes a fast and robust method to approximate the Fisher-Rao distance between multivariate normal distributions, with a guaranteed \((1+\epsilon)\)-approximation for any \(\epsilon > 0\).
2. **Introducing new distances**
   - The paper introduces a class of distances based on diffeomorphic embeddings of the normal manifold into a submanifold of the higher-dimensional symmetric positive-definite (SPD) cone. The projective Hilbert distance on the cone yields a metric on the embedded submanifold. Pulling back this cone distance, together with its associated straight-line Hilbert cone geodesics, gives a distance and smooth paths between normal distributions.
   - This new distance, called the pullback Hilbert cone distance, is computationally lighter because it requires only the minimum and maximum eigenvalues of a matrix.

### Main contributions of the paper

- **Approximation of the Fisher-Rao distance**: Provides a fast method with guaranteed accuracy to approximate the Fisher-Rao distance.
- **New distances**: Proposes the pullback Hilbert cone distance, which is cheap to compute, is a metric on the embedded normal submanifold, and comes with straight-line cone geodesics.
- **Application examples**: Shows how to use these distances and paths in clustering tasks, for example to simplify and quantize Gaussian mixture models (GMMs).

### Formula summary

1. **Fisher-Rao distance**:
   \[ \rho_{FR}(N_0, N_1)=\int_{0}^{1}ds_{Fisher}(\gamma_{FR}(N_0, N_1; t))\,dt \]
   where \(ds_{Fisher}\) is the Fisher length element and \(\gamma_{FR}(N_0, N_1; t)\) is the Fisher-Rao geodesic joining \(N_0\) to \(N_1\).
2. **Pullback Hilbert cone distance**:
   - Embedding: \(f_a(N(\mu,\Sigma))=\begin{bmatrix}\Sigma + a\mu\mu^{\top}&a\mu\\a\mu^{\top}&a\end{bmatrix}\)
   - Projective Hilbert distance between the embedded matrices \(P_0 = f_a(N_0)\) and \(P_1 = f_a(N_1)\), which depends only on the extreme eigenvalues:
     \[ \rho_{Hilbert}(P_0, P_1)=\log\frac{\lambda_{\max}(P_0^{-1/2}P_1P_0^{-1/2})}{\lambda_{\min}(P_0^{-1/2}P_1P_0^{-1/2})} \]

With these methods, the paper provides practical tools and theoretical support for processing data sets of multivariate normal distributions.
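The pullback Hilbert cone distance described above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. It assumes the embedding \(f_a\) from the formula summary with \(a > 0\), and it uses the standard form of Hilbert's projective metric on the SPD cone, the logarithm of the ratio of the largest to the smallest eigenvalue of \(P_0^{-1}P_1\), which matches the abstract's remark that only the extreme eigenvalues are needed. The function names `embed_normal`, `hilbert_cone_distance`, and `pullback_hilbert_distance` are hypothetical, chosen for this example.

```python
import numpy as np


def embed_normal(mu, Sigma, a=1.0):
    """Embed N(mu, Sigma) as a (d+1)x(d+1) SPD matrix using f_a (assumes a > 0)."""
    mu = np.asarray(mu, dtype=float).reshape(-1, 1)
    Sigma = np.asarray(Sigma, dtype=float)
    return np.block([
        [Sigma + a * (mu @ mu.T), a * mu],
        [a * mu.T, np.array([[a]])],
    ])


def hilbert_cone_distance(P0, P1):
    """Hilbert projective distance log(lambda_max / lambda_min) of P0^{-1} P1."""
    # Whiten P1 with the Cholesky factor of P0: M = L^{-1} P1 L^{-T}.
    # M is symmetric and has the same eigenvalues as P0^{-1} P1.
    L = np.linalg.cholesky(P0)
    X = np.linalg.solve(L, P1)       # L^{-1} P1
    M = np.linalg.solve(L, X.T)      # L^{-1} P1 L^{-T}, since P1 is symmetric
    M = 0.5 * (M + M.T)              # remove round-off asymmetry
    lam = np.linalg.eigvalsh(M)      # eigenvalues in ascending order
    return float(np.log(lam[-1] / lam[0]))


def pullback_hilbert_distance(mu0, Sigma0, mu1, Sigma1, a=1.0):
    """Distance between two normals obtained by pulling back the cone distance."""
    return hilbert_cone_distance(
        embed_normal(mu0, Sigma0, a), embed_normal(mu1, Sigma1, a)
    )


if __name__ == "__main__":
    d = pullback_hilbert_distance(
        [0.0, 1.0], [[2.0, 0.5], [0.5, 1.0]],
        [1.0, -1.0], [[1.0, 0.0], [0.0, 3.0]],
    )
    print(d)
```

For clarity the sketch computes the full spectrum with `eigvalsh`; for large matrices an iterative solver that returns only the two extreme eigenvalues would be the natural substitute.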

Fisher-Rao distance and pullback SPD cone distances between multivariate normal distributions
