Divergence Function, Duality, and Convex Analysis
Jun Zhang
DOI: https://doi.org/10.1162/08997660460734047
IF: 3.278
2004-01-01
Neural Computation
Abstract: From a smooth, strictly convex function $\Phi: \mathbb{R}^n \to \mathbb{R}$, a parametric family of divergence functions $D_\Phi^{(\alpha)}$ may be introduced: $D_\Phi^{(\alpha)}(x, y) = \frac{4}{1-\alpha^2}\left[\frac{1-\alpha}{2}\Phi(x) + \frac{1+\alpha}{2}\Phi(y) - \Phi\left(\frac{1-\alpha}{2}x + \frac{1+\alpha}{2}y\right)\right]$ for $x, y \in \operatorname{int}\,\operatorname{dom}(\Phi) \subset \mathbb{R}^n$ and for $\alpha \in \mathbb{R}$, with $D_\Phi^{(\pm 1)}$ defined through taking the limit $\alpha \to \pm 1$. Each member is shown to induce an $\alpha$-independent Riemannian metric, as well as a pair of dual $\alpha$-connections, which are generally nonflat, except for $\alpha = \pm 1$. In the latter case, $D_\Phi^{(\pm 1)}$ reduces to the (nonparametric) Bregman divergence, which is representable using $\Phi$ and its convex conjugate $\Phi^*$ and becomes the canonical divergence for dually flat spaces (Amari, 1982, 1985; Amari & Nagaoka, 2000). This formulation based on convex analysis naturally extends the information-geometric interpretation of divergence functions (Eguchi, 1983) to allow the distinction between two different kinds of duality: referential duality ($\alpha \leftrightarrow -\alpha$) and representational duality ($\Phi \leftrightarrow \Phi^*$). When applied to (not necessarily normalized) probability densities, the concept of conjugated representations of densities is introduced, so that $\pm\alpha$-connections defined on probability densities embody both referential and representational duality and are hence themselves bidual. When restricted to a finite-dimensional affine submanifold, the natural parameters of a certain representation of densities and the expectation parameters under its conjugate representation form biorthogonal coordinates. The alpha representation (indexed by $\beta$ now, $\beta \in [-1, 1]$) is shown to be the only measure-invariant representation. The resulting two-parameter family of divergence functionals $D^{(\alpha, \beta)}$, $(\alpha, \beta) \in [-1, 1] \times [-1, 1]$, induces identical Fisher information but bidual alpha-connection pairs; it reduces in form to Amari's alpha-divergence family when $\alpha = \pm 1$ or when $\beta = 1$, but to the family of Jensen difference (Rao, 1987) when $\beta = -1$.
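As an illustration only, the following minimal Python sketch evaluates the divergence family numerically, assuming the form $D_\Phi^{(\alpha)}(x, y) = \frac{4}{1-\alpha^2}\left[\frac{1-\alpha}{2}\Phi(x) + \frac{1+\alpha}{2}\Phi(y) - \Phi\left(\frac{1-\alpha}{2}x + \frac{1+\alpha}{2}y\right)\right]$. The choice $\Phi(x) = \sum_i (x_i \log x_i - x_i)$ and the names `phi`, `grad_phi`, `alpha_divergence`, and `bregman` are assumptions made for this example, not taken from the paper. The sketch can be used to check two properties stated in the abstract: referential duality ($\alpha \leftrightarrow -\alpha$ together with swapping the arguments) and the approach to the Bregman divergence as $\alpha \to 1$.

```python
import math


def phi(x):
    # Assumed example of a smooth, strictly convex function on positive vectors:
    # Phi(x) = sum_i (x_i * log(x_i) - x_i).
    return sum(xi * math.log(xi) - xi for xi in x)


def grad_phi(x):
    # Gradient of the assumed Phi: d/dx_i (x_i log x_i - x_i) = log(x_i).
    return [math.log(xi) for xi in x]


def alpha_divergence(x, y, alpha):
    # D^(alpha)(x, y) for alpha strictly different from +1 and -1.
    a = (1.0 - alpha) / 2.0
    b = (1.0 + alpha) / 2.0
    mix = [a * xi + b * yi for xi, yi in zip(x, y)]
    return 4.0 / (1.0 - alpha ** 2) * (a * phi(x) + b * phi(y) - phi(mix))


def bregman(x, y):
    # Bregman divergence: Phi(x) - Phi(y) - <x - y, grad Phi(y)>.
    g = grad_phi(y)
    inner = sum((xi - yi) * gi for xi, yi, gi in zip(x, y, g))
    return phi(x) - phi(y) - inner


# Two arbitrary positive vectors (not necessarily normalized).
x = [0.2, 0.5, 0.9]
y = [0.4, 0.3, 1.1]
```

With these definitions, `alpha_divergence(x, y, 0.3)` and `alpha_divergence(y, x, -0.3)` should agree up to rounding, and `alpha_divergence(x, y, alpha)` should approach `bregman(x, y)` as `alpha` tends to 1 (and `bregman(y, x)` as `alpha` tends to -1). For this particular $\Phi$, the Bregman divergence is the generalized Kullback-Leibler divergence $\sum_i (x_i \log(x_i / y_i) - x_i + y_i)$.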
Medicine