Deriving Probability Density Functions from Probabilistic Functional Programs

Sooraj Bhat, Johannes Borgström, Andrew D. Gordon, Claudio Russo
DOI: https://doi.org/10.23638/LMCS-13%282%3A16%292017
2017-06-30
Abstract: The probability density function of a probability distribution is a fundamental concept in probability theory and a key ingredient in various widely used machine learning methods. However, the necessary framework for compiling probabilistic functional programs to density functions has only recently been developed. In this work, we present a density compiler for a probabilistic language with failure and both discrete and continuous distributions, and provide a proof of its soundness. The compiler greatly reduces the development effort of domain experts, which we demonstrate by solving inference problems from various scientific applications, such as modelling the global carbon cycle, using a standard Markov chain Monte Carlo framework.
Programming Languages, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the problem of automatically deriving probability density functions (PDFs) from probabilistic functional programs. Specifically, the authors develop a compiler that converts programs in a rich probabilistic programming language into the corresponding density functions. The work matters because many machine learning methods, such as maximum-likelihood estimation, maximum-a-posteriori estimation, L2 estimation, importance sampling, and Markov chain Monte Carlo (MCMC), take a probability density function as input. Generating these density functions directly from programs is difficult for two reasons: not every probability distribution has a density function, and even when one exists, computing it can be non-trivial.

### Main Contributions

1. **Compiler Implementation**: The authors provide the first implementation of a density compiler based on the specification of Bhat et al. (2012). It compiles programs in the probabilistic language Fun into the corresponding density functions.
2. **Correctness Proof**: The authors prove the correctness of the compilation algorithm (Theorem 3.15), the first such proof for this type of compiler.
3. **Reduction of Development Effort**: The compiler significantly reduces the workload of domain experts, who no longer need to write complex density code by hand. The automatically generated code is comparable in performance to density functions written manually by experts.
4. **Practical Applications**: The authors demonstrate the practical effectiveness of the compiler in various scientific applications, including ecological models.
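
To illustrate why methods such as MCMC need a density function as input, here is a minimal random-walk Metropolis sketch in Python. It is not taken from the paper: the target `log_density` (a standard normal, up to an additive constant), the function name `metropolis_hastings`, and the step size are all illustrative assumptions. The point is only that the sampler interacts with the model through a single function that evaluates the (log-)density.

```python
import math
import random


def log_density(x):
    """Illustrative target: log-density of a standard normal, up to a constant."""
    return -0.5 * x * x


def metropolis_hastings(log_pdf, x0, n_steps, step=1.0, seed=0):
    """Random-walk Metropolis sampler driven only by a log-density function."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = x + rng.gauss(0.0, step)
        # Acceptance probability min(1, p(proposal) / p(x)), computed in log space.
        log_ratio = log_pdf(proposal) - log_pdf(x)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        samples.append(x)
    return samples


samples = metropolis_hastings(log_density, x0=0.0, n_steps=20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print("sample mean:", mean)
print("sample variance:", var)
```

Replacing `log_density` with the density of another model leaves the sampler unchanged, which is why a tool that produces the density from a model description removes most of the manual work.
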
### Background

- **Probability Density Function (PDF)**: In probability theory, the probability density function is a fundamental concept used to describe the probability distribution of continuous random variables. It is a key component of many machine learning methods.
- **Probabilistic Programming Languages**: Probabilistic programming languages let data scientists specify their probability models declaratively, leaving the conversion of these models into efficient sampling or inference algorithms to the compiler.
- **Complexity of the Problem**: Not all probability distributions have density functions, and even when one exists, computing it can be very complex. Automatically deriving density functions from probabilistic programs is therefore an important research topic.

### Methods

- **Compiler Design**: The compiler converts programs in the probabilistic programming language Fun into expressions in the target language ∫un, which supports real-valued first-order functions and standard integration.
- **Handling Failures and Branches**: The compiler specifically handles the `fail` construct, the `match` construct (and the general `if` construct), pure (i.e., deterministic) `let` bindings, and integer arithmetic.
- **Correctness Guarantee**: The correctness of the compiler is established by Theorem 3.15, which ensures that the compilation results are reliable.

### Conclusion

This paper proposes a method for automatically deriving probability density functions from probabilistic functional programs. It establishes the method's correctness with a soundness proof and demonstrates its effectiveness through an implementation and scientific applications. The work both reduces the burden on domain experts and provides a practical tool for applying probabilistic programming languages.
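
As a rough illustration of the Methods section, the Python sketch below writes out by hand the densities of two toy programs. It is not output of the paper's compiler, and the programs in the comments use informal notation rather than Fun syntax. The first density uses an integral to marginalise an intermediate random variable, which is the kind of expression a target language with integration can represent. The second assumes that a failing branch simply removes its probability mass, so the resulting density integrates to less than one. The helpers `normal_pdf` and `integrate` and all numeric parameters are assumptions made for this example.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def integrate(f, lo, hi, n=4000):
    """Composite trapezoid rule on [lo, hi] with n sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


# Program A (informal notation): let m = gaussian(0, 1) in gaussian(m, 1)
# The intermediate variable m is integrated out; the infinite range is
# truncated to [-10, 10], where the integrand is negligible at the ends.
def density_a(x):
    return integrate(
        lambda m: normal_pdf(m, 0.0, 1.0) * normal_pdf(x, m, 1.0),
        -10.0,
        10.0,
    )


# Program B (informal notation): if flip(0.3) then fail else gaussian(2, 1)
# Assumption: the failing branch contributes no mass, leaving 0.7 of the total.
def density_b(x):
    return 0.7 * normal_pdf(x, 2.0, 1.0)


# Program A is a sum of two independent unit-variance normals, so its density
# has the closed form of a normal with mean 0 and standard deviation sqrt(2).
closed_form_a = normal_pdf(0.5, 0.0, math.sqrt(2.0))
mass_b = integrate(density_b, -10.0, 14.0)

print("density_a(0.5):", density_a(0.5), "closed form:", closed_form_a)
print("total mass of program B:", mass_b)
```

Comparing `density_a` with the closed form is a simple sanity check on a hand-derived density. A correctness proof for a compiler gives that guarantee for every program, rather than one test point at a time.
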