Deep PDF: Probabilistic Surface Optimization and Density Estimation

Dmitry Kopitkov, Vadim Indelman
DOI: https://doi.org/10.48550/arXiv.1807.10728
2018-09-15
Abstract: A probability density function (pdf) encodes the entire stochastic knowledge about data distribution, where data may represent stochastic observations in robotics, transition state pairs in reinforcement learning or any other empirically acquired modality. Inferring data pdf is of prime importance, allowing to analyze various model hypotheses and perform smart decision making. However, most density estimation techniques are limited in their representation expressiveness to specific kernel type or predetermined distribution family, and have other restrictions. For example, kernel density estimation (KDE) methods require meticulous parameter search and are extremely slow at querying new points. In this paper we present a novel non-parametric density estimation approach, DeepPDF, that uses a neural network to approximate a target pdf given samples from thereof. Such a representation provides high inference accuracy for a wide range of target pdfs using a relatively simple network structure, making our method highly statistically robust. This is done via a new stochastic optimization algorithm, Probabilistic Surface Optimization (PSO), that turns to advantage the stochastic nature of sample points in order to force network output to be identical to the output of a target pdf. Once trained, query point evaluation can be efficiently done in DeepPDF by a simple network forward pass, with linear complexity in the number of query points. Moreover, the PSO algorithm is capable of inferring the frequency of data samples and may also be used in other statistical tasks such as conditional estimation and distribution transformation. We compare the derived approach with KDE methods showing its superior performance and accuracy.
Machine Learning
### What problem does this paper attempt to solve?

This paper addresses the estimation of a probability density function (pdf). The authors propose DeepPDF, a non-parametric method that estimates a target pdf from samples of it. Traditional techniques such as kernel density estimation (KDE) have several limitations: their expressiveness is restricted to a specific kernel type or a predetermined distribution family, they are slow when querying new points, and they require a careful parameter search, which makes them harder to use.

### What are the main contributions of DeepPDF?

1. **A new Probabilistic Surface Optimization (PSO) algorithm**: PSO exploits the stochastic nature of the sample points to drive the output of the neural network toward the target pdf.
2. **Neural-network approximation of the target density via PSO**: According to the abstract, this gives high inference accuracy for a wide range of target pdfs using a relatively simple network structure.
3. **Batch-mode training**: Training runs in batch mode, with exponential learning-rate decay to improve convergence; the complete algorithm is called DeepPDF.
4. **Analysis of deep learning aspects of DeepPDF**: These include the model structure and the choice of optimizer.

### What are the advantages of DeepPDF compared to traditional methods?

- **Higher expressiveness**: DeepPDF can use a neural network of any architecture, which gives it more flexibility than a fixed kernel type or distribution family.
- **Faster queries**: Once trained, DeepPDF evaluates query points with a network forward pass whose cost is linear in the number of query points, compared with the quasi-linear query complexity this summary cites for KDE.
- **No manual parameter tuning**: DeepPDF avoids the meticulous parameter search that KDE requires, which simplifies its use.
- **Better statistical robustness**: DeepPDF estimates the density without assuming a particular data distribution, so it applies to a wider range of data.

### Formula summary

1. **Loss function in PSO**:

\[ L_{\text{pdf}}(\theta, X_U, X_D) = -f(X_U; \theta)\cdot P_D(X_U) + f(X_D; \theta)\cdot [f(X_D; \theta)]_{mf} \]

where \(P_D(X_U)\) is the density of the sample point \(X_U\) under the down distribution \(P_D\), and \([\cdot]_{mf}\) marks a term that contributes only a magnitude coefficient: it is treated as a constant, and no gradient with respect to \(\theta\) is computed through it.

2. **Expected difference**:

\[ E[df(X)] = \delta\cdot\int [F_U(X') - F_D(X'; \theta)]\cdot g(X', X, \theta)\,dX' \]

where \(F_U(X') = P_U(X')\cdot P_D(X')\) and \(F_D(X'; \theta) = P_D(X')\cdot f(X'; \theta)\) are the up-push and down-push forces, respectively. The two forces cancel when \(f(X'; \theta) = P_U(X')\) wherever \(P_D(X') > 0\), so the expected change vanishes once the network output matches \(P_U\).

With these elements, DeepPDF provides an efficient and flexible density estimation method, particularly suited to large-scale data sets and complex distributions.
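To make the PSO loss and the force balance from the formula summary concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the neural network is replaced by a hypothetical piecewise-constant model `f(x; theta) = theta[bin_index(x)]`, so the gradient with respect to `theta` is simply an indicator of the bin. The data density `2x` on `[0, 1)`, the uniform `P_D`, the bin count `K`, and all hyperparameters are assumptions chosen for illustration. The exponential learning-rate decay follows the summary's description of batch training, but the sketch uses one sample pair per step.

```python
import math
import random

K = 5  # number of bins in the toy model (an assumption for illustration)


def bin_index(x):
    """Index of the bin containing x in [0, 1); clamps the right edge."""
    return min(int(x * K), K - 1)


def pso_train(steps=200_000, lr_start=0.05, lr_end=0.0005, seed=0):
    """Stochastic gradient descent on the PSO loss for a tabular model.

    Model: f(x; theta) = theta[bin_index(x)], so df/dtheta_k is 1 for the
    bin containing x and 0 elsewhere.
    """
    rng = random.Random(seed)
    theta = [1.0] * K
    decay = (lr_end / lr_start) ** (1.0 / steps)  # exponential lr decay
    lr = lr_start
    for _ in range(steps):
        x_u = math.sqrt(rng.random())  # sample from P_U with density 2x
        x_d = rng.random()             # sample from P_D = Uniform(0, 1)
        p_d_at_x_u = 1.0               # P_D(x_u) for the uniform density
        i_u, i_d = bin_index(x_u), bin_index(x_d)
        f_d = theta[i_d]  # [f(X_D; theta)]_mf: read as a constant magnitude
        # Gradient descent on L = -f(x_u) * P_D(x_u) + f(x_d) * [f(x_d)]_mf
        theta[i_u] += lr * p_d_at_x_u  # up term pushes the surface up
        theta[i_d] -= lr * f_d         # down term pushes the surface down
        lr *= decay
    return theta


estimate = pso_train()
# Bin-averaged density of 2x, the balance point of the two forces.
target = [2 * (k + 0.5) / K for k in range(K)]
for k in range(K):
    print(f"bin {k}: estimate={estimate[k]:.3f}  target={target[k]:.1f}")
```

In this toy setting the expected update of `theta[k]` is proportional to the probability that a data sample lands in bin `k` minus `theta[k] / K`, which is zero exactly when `theta[k]` equals the average data density over that bin. This is the tabular analogue of the condition \(F_U = F_D\). A neural network shares parameters across inputs, so its updates are coupled between points, an effect this sketch does not capture.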