Flexible Heteroscedastic Count Regression with Deep Double Poisson Networks

Spencer Young, Porter Jenkins, Lonchao Da, Jeff Dotson, Hua Wei
2024-10-14
Abstract: Neural networks that can produce accurate, input-conditional uncertainty representations are critical for real-world applications. Recent progress on heteroscedastic continuous regression has shown great promise for calibrated uncertainty quantification on complex tasks, like image regression. However, when these methods are applied to discrete regression tasks, such as crowd counting, ratings prediction, or inventory estimation, they tend to produce predictive distributions with numerous pathologies. Moreover, discrete models based on the Generalized Linear Model (GLM) framework either cannot process complex input or are not fully heteroscedastic. To address these issues, we propose the Deep Double Poisson Network (DDPN). In contrast to networks trained to minimize Gaussian negative log likelihood (NLL), discrete network parameterizations (i.e., Poisson, Negative Binomial), and GLMs, DDPN can produce discrete predictive distributions of arbitrary flexibility. Additionally, we propose a technique to tune the prioritization of mean fit and probabilistic calibration during training. We show DDPN 1) vastly outperforms existing discrete models; 2) meets or exceeds the accuracy and flexibility of networks trained with Gaussian NLL; 3) produces proper predictive distributions over discrete counts; and 4) exhibits superior out-of-distribution detection. DDPN can easily be applied to a variety of count regression datasets including tabular, image, point cloud, and text data.
Machine Learning
### What problems does this paper attempt to solve?

This paper addresses the flexibility and calibration of predictive distributions in discrete regression tasks such as crowd counting, ratings prediction, and inventory estimation. Existing methods fall short on these tasks in three ways:

1. **Continuous models applied to discrete tasks**: When continuous regression methods (such as networks trained to minimize Gaussian negative log-likelihood, or Gaussian NLL) are applied to discrete targets, the predictive distributions become pathological. For example:
   - They assign non-zero probability to infeasible real values.
   - Their support is unbounded, so negative counts can receive non-zero probability.
   - The boundaries of a high-density interval may fall between two valid integers, which makes the interval harder to interpret and use.
2. **Limitations of Generalized Linear Models (GLMs)**:
   - Discrete models based on the GLM framework either cannot handle complex input data or are not fully heteroscedastic.
   - The Poisson distribution assumes that the mean and variance are equal (equidispersion), which limits the flexibility of the model.
   - The Negative Binomial distribution adds a parameter that relaxes the equidispersion assumption, but it still requires the variance to be greater than or equal to the mean (over-dispersion), so it cannot represent under-dispersed data.
3. **Deficiencies of existing discrete regression models**:
   - Existing discrete neural regression models (such as Poisson DNNs and Negative Binomial DNNs) are not flexible enough to capture predictive distributions conditioned on complex inputs.
   - These models perform poorly on complex data such as images, point clouds, and text.

To solve these problems, the paper proposes the **Deep Double Poisson Network (DDPN)**, a new discrete neural regression model.
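The first two groups of problems can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: the function names and the example values (a Gaussian prediction with mean 1.2 and standard deviation 1.0, a Negative Binomial with size parameter `r`) are assumptions chosen for this illustration, not figures from the paper. It uses the standard Gaussian CDF and the standard Negative Binomial variance formula `mu + mu**2 / r`.

```python
import math


def gaussian_cdf(x, mu, sigma):
    """Standard closed form of the Gaussian CDF via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def gaussian_interval(mu, sigma, z=1.959964):
    """Central ~95% interval of a Gaussian (z is the usual 97.5% quantile)."""
    return mu - z * sigma, mu + z * sigma


def poisson_variance(mu):
    """A Poisson distribution is equidispersed: variance equals the mean."""
    return mu


def negative_binomial_variance(mu, r):
    """Mean/size parameterization: variance is mu + mu^2 / r, never below mu."""
    return mu + mu ** 2 / r


# Hypothetical Gaussian prediction for a small count.
mu, sigma = 1.2, 1.0
p_negative = gaussian_cdf(0.0, mu, sigma)   # mass on impossible negative counts
low, high = gaussian_interval(mu, sigma)    # endpoints are generally not integers

print(f"P(count < 0) under the Gaussian: {p_negative:.3f}")
print(f"95% interval: ({low:.2f}, {high:.2f})")
print(f"Poisson variance at mean 3: {poisson_variance(3.0)}")
print(f"Negative Binomial variance at mean 3, r=5: {negative_binomial_variance(3.0, 5.0)}")
```

Whatever positive value `r` takes, `negative_binomial_variance` cannot drop below the mean, which is the over-dispersion restriction described in item 2.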
The main contributions of DDPN include:

- **Fully heteroscedastic**: It predicts the mean and dispersion independently, so it can adapt its predictive distribution to each input.
- **Flexible discrete predictive distribution**: It can handle over-dispersion, under-dispersion, and equidispersion, and is suitable for various discrete regression tasks.
- **Adjustable mean fitting**: By introducing the hyperparameter β, the priority between mean fitting and overall likelihood calibration can be adjusted.
- **Applicable to complex data**: It can learn accurate and reliable uncertainty representations on multiple data types such as tabular data, images, point clouds, and text.

In general, DDPN aims to provide a more flexible and accurate method for handling predictive distributions in discrete regression tasks, especially when dealing with complex input data.
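To illustrate why a double Poisson output can cover all three dispersion regimes, the sketch below evaluates the unnormalized double Poisson mass in Efron's form, with a mean parameter `mu` and an inverse-dispersion parameter `phi`, and checks its moments numerically. Several things here are assumptions rather than details from the summary above: the function names, the truncation of the support at 200, and in particular the shape of the β weighting (`phi ** -beta` applied to the per-example loss), which is one plausible way to trade mean fit against calibration and should not be read as the paper's exact training objective.

```python
import math


def dp_log_unnormalized(y, mu, phi):
    """Log of Efron's unnormalized double Poisson mass at an integer y >= 0."""
    base = 0.5 * math.log(phi) - phi * mu
    if y == 0:
        return base  # the y-dependent factors all tend to 1 as y -> 0
    return (base - y + y * math.log(y) - math.lgamma(y + 1)
            + phi * y * (1.0 + math.log(mu) - math.log(y)))


def dp_nll(y, mu, phi):
    """Negative log-likelihood, ignoring the normalizing constant."""
    return -dp_log_unnormalized(y, mu, phi)


def beta_weighted_nll(y, mu, phi, beta):
    """Hypothetical beta weighting (an assumption, see the text above).

    In a real training loop the weight would be treated as a constant,
    i.e. excluded from gradient computation. beta = 0 recovers dp_nll.
    """
    return (phi ** -beta) * dp_nll(y, mu, phi)


def dp_moments(mu, phi, max_y=200):
    """Mean and variance after normalizing over the truncated support."""
    weights = [math.exp(dp_log_unnormalized(y, mu, phi)) for y in range(max_y)]
    total = sum(weights)
    mean = sum(y * w for y, w in enumerate(weights)) / total
    var = sum((y - mean) ** 2 * w for y, w in enumerate(weights)) / total
    return mean, var


for phi in (0.5, 1.0, 2.0):
    mean, var = dp_moments(10.0, phi)
    print(f"phi={phi}: mean={mean:.2f}, variance={var:.2f}")
```

With `phi = 1` the expression reduces exactly to the Poisson log-mass, `phi < 1` gives variance above the mean, and `phi > 1` gives variance below it, which is the under-dispersed case that Poisson and Negative Binomial outputs cannot express.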