On the Structure of Information
Sebastian Gottwald, Daniel A. Braun
2024-09-30
Abstract: Shannon information and Shannon entropy are undoubtedly the most commonly used quantitative measures of information, cropping up in the literature across a broad variety of disciplines, often in contexts unrelated to coding theory. Here, we generalize the original idea behind Shannon entropy, as the cost of encoding a sample of a random variable in terms of the required codeword length, to arbitrary loss functions by considering the optimally achievable loss given a certain level of knowledge about the random variable. By formalizing knowledge in terms of the measure-theoretic notion of sub-$\sigma$-algebras, we arrive at a general notion of uncertainty reduction that includes entropy and information as special cases: entropy is the reduction of uncertainty from no (or partial) knowledge to full knowledge about a random variable, whereas information is the reduction of uncertainty from no (or partial) knowledge to partial knowledge. As examples, we get Shannon information and entropy when measuring loss in terms of message length, variance for squared error loss, and, more generally, Bregman information for the Bregman loss. Moreover, we show that appealing properties of Shannon entropy and information extend to the general case, including well-known relations involving the KL divergence, which are extended to divergences of proper scoring rules.
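As a rough numerical illustration of two special cases named in the abstract, the following Python sketch computes the best achievable expected loss with no knowledge of the outcome. For a discrete variable, full knowledge allows zero loss under both losses, so this optimal no-knowledge loss is the uncertainty reduction in question: Shannon entropy for codeword-length (log) loss and variance for squared error loss. The distribution `p`, the outcomes `values`, and all function names are illustrative assumptions, not taken from the paper.

```python
import math

# Hypothetical discrete random variable: outcomes and their probabilities.
values = [0.0, 1.0, 2.0]
p = [0.5, 0.25, 0.25]


def shannon_entropy(p):
    """Shannon entropy in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def expected_log_loss(p, q):
    """Expected codeword length (in bits) when the code is built for q
    but the outcomes are actually drawn from p."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)


def mean(p, values):
    """Expectation of the random variable."""
    return sum(pi * x for pi, x in zip(p, values))


def expected_squared_loss(p, values, a):
    """Expected squared error of the constant guess a."""
    return sum(pi * (x - a) ** 2 for pi, x in zip(p, values))


# Log loss: predicting q = p is optimal, and the optimal loss is the entropy.
uniform = [1 / 3] * 3
print(shannon_entropy(p))             # 1.5
print(expected_log_loss(p, p))        # 1.5, equals the entropy
print(expected_log_loss(p, uniform))  # larger than the entropy

# Squared error: guessing the mean is optimal, and the optimal loss is the variance.
m = mean(p, values)
print(expected_squared_loss(p, values, m))    # 0.6875, the variance
print(expected_squared_loss(p, values, 1.0))  # larger than the variance
```

The gap between the loss of a suboptimal prediction and the optimal one is the KL divergence in the log-loss case and the squared distance to the mean in the squared-error case, which is the kind of divergence relation the abstract refers to.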
Information Theory