Density Ratio Estimation via Sampling along Generalized Geodesics on Statistical Manifolds

Masanari Kimura, Howard Bondell
2024-06-27
Abstract: The density ratio of two probability distributions is one of the fundamental tools in mathematical and computational statistics and machine learning, and it has a variety of known applications. Therefore, density ratio estimation from finite samples is a very important task, but it is known to be unstable when the distributions are distant from each other. One approach to address this problem is density ratio estimation using incremental mixtures of the two distributions. We geometrically reinterpret existing methods for density ratio estimation based on incremental mixtures. We show that these methods can be regarded as iterating on the Riemannian manifold along a particular curve between the two probability distributions. Making use of the geometry of the manifold, we propose to consider incremental density ratio estimation along generalized geodesics on this manifold. Achieving such a method requires Monte Carlo sampling along geodesics via transformations of the two distributions. We show how to implement an iterative algorithm to sample along these geodesics and show how changing the distances along the geodesic affects the variance and accuracy of the estimation of the density ratio. Our experiments demonstrate that the proposed approach outperforms the existing approaches using incremental mixtures that do not take the geometry of the
Machine Learning
What problem does this paper attempt to address?
The paper addresses how to estimate the density ratio between two probability distributions stably when the distributions are far apart. Traditional density ratio estimation methods often become unstable when the source and target distributions differ substantially, which leads to inaccurate estimates. To overcome this, the paper proposes Generalized Incremental Mixture Density Ratio Estimation (GIMDRE), a method based on generalized geodesics on the statistical manifold.

### Main contributions:

1. **Geometric reinterpretation**: The paper reinterprets existing Incremental Mixture Density Ratio Estimation (IMDRE) methods from the perspective of information geometry. The authors show that these methods can be understood as sequential density ratio estimation along a specific curve, the m-geodesic, on the statistical manifold.
2. **Generalized geodesics**: The paper proposes using generalized geodesics (α-geodesics) for density ratio estimation. α-geodesics form a more flexible family of curves connecting two probability distributions on the statistical manifold, and an appropriate choice of α lets the method adapt to different estimation settings.
3. **Optimization algorithm**: To implement GIMDRE, the paper develops an alternating optimization algorithm. It iteratively estimates the density ratio and updates the sampling weights using Monte Carlo sampling and importance weighting, which resolves the interdependence between sampling and density ratio estimation.
4. **Numerical experiments**: The paper designs a series of numerical experiments to verify the effectiveness and behavior of GIMDRE. The results show that, compared with the traditional IMDRE method, GIMDRE provides more accurate and stable density ratio estimates in various settings.

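The interdependence between sampling and ratio estimation in contribution 3 can be made concrete with a small sketch. For a curve point of the form \(\left((1-\lambda)p^{a}+\lambda q^{a}\right)^{1/a}\) with \(a=\frac{1-\alpha}{2}\), dividing by \(q\) gives an unnormalized importance weight \(\left((1-\lambda)r^{a}+\lambda\right)^{1/a}\) that depends on \(x\) only through the ratio \(r=p/q\); for \(\alpha=1\) the weight is \(r^{1-\lambda}\). So reweighting samples from \(q\) toward an intermediate point needs the very ratio being estimated. The code below is an illustration under stated assumptions, not the paper's algorithm: the function names are hypothetical, the distributions are a toy Gaussian pair, and the known true ratio stands in for an estimate.

```python
import math
import random


def geodesic_weight(r, lam, alpha):
    """Unnormalized importance weight gamma(x) / q(x) for a point on the
    alpha-geodesic, expressed through the density ratio r = p(x) / q(x)."""
    if alpha == 1:
        # Geometric (exponential) path: p^(1-lam) * q^lam / q = r^(1-lam).
        return r ** (1.0 - lam)
    a = (1.0 - alpha) / 2.0
    return ((1.0 - lam) * r ** a + lam) ** (1.0 / a)


def true_ratio(x):
    # Assumed toy setup: p = N(1, 1) and q = N(0, 1), so p(x)/q(x) = exp(x - 1/2).
    # In practice this would be the current ratio estimate, not the truth.
    return math.exp(x - 0.5)


def weighted_mean(samples, ratio_fn, lam, alpha):
    """Self-normalized importance-sampling estimate of E[x] under the
    normalized curve point, using samples drawn from q."""
    weights = [geodesic_weight(ratio_fn(x), lam, alpha) for x in samples]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, samples)) / total


rng = random.Random(0)
q_samples = [rng.gauss(0.0, 1.0) for _ in range(20000)]

# alpha = -1 is the mixture curve; alpha = 1 is the geometric curve.
mix_mean = weighted_mean(q_samples, true_ratio, lam=0.5, alpha=-1.0)
geo_mean = weighted_mean(q_samples, true_ratio, lam=0.5, alpha=1.0)
```

For this toy pair, both intermediate distributions at \(\lambda=0.5\) have mean 0.5, so `mix_mean` and `geo_mean` should both land near that value up to Monte Carlo error. Replacing `true_ratio` with an estimate that is refit after each reweighting would give the alternating structure the summary describes.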
### Mathematical background:

- **Density ratio**: Let \(p_s(x)\) and \(p_t(x)\) be the probability density functions of the source and target distributions, respectively. The density ratio is defined as \(r(x)=\frac{p_s(x)}{p_t(x)}\).
- **α-geodesics**: On the statistical manifold, α-geodesics are curves connecting two probability distributions \(p\) and \(q\), parameterized by \(\lambda\in[0,1]\):
\[
\gamma^{(\alpha)}(\lambda)=
\begin{cases}
\left((1-\lambda)\,p(x)^{\frac{1-\alpha}{2}}+\lambda\,q(x)^{\frac{1-\alpha}{2}}\right)^{\frac{2}{1-\alpha}}, & \text{if }\alpha\neq 1\\
\exp\left((1-\lambda)\ln p(x)+\lambda\ln q(x)\right), & \text{if }\alpha=1
\end{cases}
\]
  Setting \(\alpha=-1\) gives the mixture \((1-\lambda)p(x)+\lambda q(x)\), which is the m-geodesic.
- **α-divergence**: The α-divergence measures the difference between two probability distributions and is defined as:
\[
D_{\alpha}[p\|q]=\frac{1}{\alpha(1-\alpha)}\left(1-\int p(x)^{\alpha}q(x)^{1-\alpha}\,dx\right)
\]

### Experimental results:

- **Evaluation of different step sizes**: Table 1 shows the GIMDRE evaluation results under different step sizes \(m\). Even when the step size is small, GIMDRE is significantly better than the traditional method, and as the step size increases, the mean and standard deviation of the estimates improve further.
- **Influence of different α values**: Tables 2 and 3 show the results for different α values under different sample sizes and dimensions.
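The α-geodesic and α-divergence definitions can be checked numerically on small discrete distributions, where integrals become sums. The following is a minimal sketch under assumptions: the function names and example vectors are hypothetical, and the divergence uses the sign convention \(\frac{1}{\alpha(1-\alpha)}\) so that its value is nonnegative. Because the curve point is not automatically normalized for \(\alpha\neq-1\), the sketch renormalizes it by default.

```python
import math


def alpha_geodesic(p, q, lam, alpha, normalize=True):
    """Point on the alpha-geodesic between discrete distributions p and q
    (lists of strictly positive probabilities), at position lam in [0, 1]."""
    if alpha == 1:
        curve = [
            math.exp((1 - lam) * math.log(pi) + lam * math.log(qi))
            for pi, qi in zip(p, q)
        ]
    else:
        a = (1 - alpha) / 2
        curve = [
            ((1 - lam) * pi ** a + lam * qi ** a) ** (1 / a)
            for pi, qi in zip(p, q)
        ]
    if normalize:
        # Only alpha = -1 (the mixture) sums to one without this step.
        z = sum(curve)
        curve = [c / z for c in curve]
    return curve


def alpha_divergence(p, q, alpha):
    """alpha-divergence with the integral replaced by a sum.
    Not defined by this formula at alpha = 0 or alpha = 1."""
    s = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q))
    return (1 - s) / (alpha * (1 - alpha))


# Hypothetical example distributions on three outcomes.
p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]

mixture = alpha_geodesic(p, q, lam=0.25, alpha=-1)   # m-geodesic point
geometric = alpha_geodesic(p, q, lam=0.25, alpha=1)  # normalized geometric mean

# Divergence from p shrinks as the curve point moves toward p.
near = alpha_divergence(p, alpha_geodesic(p, q, 0.1, 0), 0.5)
far = alpha_divergence(p, alpha_geodesic(p, q, 0.9, 0), 0.5)
```

With `alpha=-1` the result is the plain mixture of `p` and `q`, and at `lam=0` or `lam=1` every value of `alpha` returns the corresponding endpoint. Varying `alpha` while holding `lam` fixed shows how the intermediate distribution shifts between the mixture and the geometric mean.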