Sufficient Statistics and Split Idempotents in Discrete Probability Theory

Bart Jacobs
DOI: https://doi.org/10.46298/entics.10520
2023-02-20
Abstract: A sufficient statistic is a deterministic function that captures an essential property of a probabilistic function (channel, kernel). Being a sufficient statistic can be expressed nicely in terms of string diagrams, as Tobias Fritz showed recently, in adjoint form. This reformulation highlights the role of split idempotents, in the Fisher-Neyman factorisation theorem. Examples of a sufficient statistic occur in the literature, but mostly in continuous probability. This paper demonstrates that there are also several fundamental examples of a sufficient statistic in discrete probability. They emerge after some combinatorial groundwork that reveals the relevant dagger split idempotents and shows that a sufficient statistic is a deterministic dagger epi.
Logic in Computer Science
What problem does this paper attempt to address?
The paper sets out to define and understand sufficient statistics in discrete probability theory. Specifically, the author develops combinatorial groundwork that reveals the relevant dagger split idempotents, and uses it to show that a sufficient statistic is a deterministic dagger epi.

### Main Objectives of the Paper

1. **Expand the Application Range of Sufficient Statistics**: Most examples in the existing literature come from continuous probability. This paper shows that there are also fundamental examples of sufficient statistics in discrete probability.
2. **Gain an In-depth Understanding of the Essence of Sufficient Statistics**: Beyond verifying the sufficiency condition of the Fisher-Neyman factorisation theorem, the paper explains the underlying mechanism through split idempotents.
3. **Formal Description**: Use string diagrams and split idempotents to reformulate the concept of a sufficient statistic, which clarifies it from an abstract perspective.

### Key Points of the Solution

- **The Role of Split Idempotents**: Split idempotents play a crucial role in the Fisher-Neyman factorisation theorem and can help identify and construct sufficient statistics.
- **Combinatorics Foundation**: Combinatorial groundwork, especially on multisets and multiset partitions, reveals the split idempotents related to sufficient statistics.
- **Dagger Epi**: In discrete probability, a sufficient statistic is shown to be a deterministic dagger epi, obtained by splitting a dagger idempotent.
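
For orientation, the classical discrete form of the Fisher-Neyman factorisation referred to above can be written as follows. This is a sketch in standard textbook notation, not the paper's string-diagram formulation; the symbols $p_\theta$, $s$, $h$, $g_\theta$ and $\varphi$ are assumptions introduced here for illustration.

```latex
% Classical discrete Fisher-Neyman factorisation.
% Assumed notation: p_\theta is a parametrised distribution on a set X,
% s : X \to Y is the statistic, h and g_\theta are nonnegative functions.
% The statistic s is sufficient when, for every parameter \theta,
\[
  p_\theta(x) \;=\; h(x)\, g_\theta\bigl(s(x)\bigr)
  \qquad \text{for all } x \in X .
\]

% Instance: K iid draws from a distribution \omega on X, with s = acc
% (accumulation of a sequence into a multiset \varphi, where \varphi(x)
% is the number of occurrences of x in the sequence).
\[
  p_\omega(x_1, \dots, x_K)
  \;=\; \prod_{i=1}^{K} \omega(x_i)
  \;=\; \prod_{x \in X} \omega(x)^{\varphi(x)},
  \qquad \varphi = \mathrm{acc}(x_1, \dots, x_K) .
\]
```

In the iid instance one may take $h = 1$ and $g_\omega(\varphi) = \prod_{x} \omega(x)^{\varphi(x)}$, so the probability of a sequence depends on the sequence only through its multiset.
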

### Specific Examples

- **Accumulation Function as a Sufficient Statistic**: For a sequence of independent and identically distributed (iid) elements, the accumulation function, which turns a sequence into the multiset of its elements, is a sufficient statistic.
- **Multiplicity Count Function as a Sufficient Statistic**: A new sufficient statistic is introduced, based on the multiplicity count function, which looks only at the frequencies of elements and not at the elements themselves.

Together, these examples show that sufficient statistics are not confined to continuous probability, and they add new insight in terms of split idempotents and dagger epis.

### Summary

The paper reformulates sufficient statistics in discrete probability theory using abstract tools (string diagrams, split idempotents, dagger epis) and illustrates the reformulation with concrete examples. This both widens the range of known examples and clarifies what makes a statistic sufficient.
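
The accumulation example listed under Specific Examples can be checked numerically on a small case. The sketch below uses the classical characterisation of sufficiency (the conditional distribution of the data given the statistic does not depend on the parameter), not the paper's categorical formulation. The function names, the encoding of a multiset as a sorted tuple of `(element, count)` pairs, and the two test distributions are all assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def acc(seq):
    """Accumulate a sequence into a multiset.

    Illustrative encoding: a sorted tuple of (element, count) pairs.
    """
    return tuple(sorted(Counter(seq).items()))


def iid(omega, k):
    """Distribution on length-k sequences of independent draws from omega."""
    dist = {}
    for seq in product(sorted(omega), repeat=k):
        p = Fraction(1)
        for x in seq:
            p *= omega[x]
        dist[seq] = p
    return dist


def conditional_given_acc(omega, k):
    """For each multiset, the conditional distribution over the sequences
    that accumulate to it, under k iid draws from omega."""
    joint = iid(omega, k)
    # Probability of each multiset: sum over all sequences accumulating to it.
    totals = {}
    for seq, p in joint.items():
        m = acc(seq)
        totals[m] = totals.get(m, Fraction(0)) + p
    # Condition each sequence probability on its multiset.
    cond = {}
    for seq, p in joint.items():
        m = acc(seq)
        if totals[m] > 0:
            cond.setdefault(m, {})[seq] = p / totals[m]
    return cond


# Two different full-support distributions on the same three-element set
# (hypothetical values chosen only for this check).
omega1 = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
omega2 = {"a": Fraction(1, 10), "b": Fraction(3, 10), "c": Fraction(6, 10)}

# The conditionals agree, so they carry no information about omega.
print(conditional_given_acc(omega1, 3) == conditional_given_acc(omega2, 3))
```

Exact `Fraction` arithmetic is used so that the comparison is an equality test rather than a floating-point approximation. In each case the conditional distribution is uniform over the arrangements of the multiset, which is why it is the same for both choices of `omega`.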