Stacked Denoising Autoencoders Based Poisson Regression For Count Data Modeling

Xinmin Zhang, Ying Liu, Zhihuan Song, Zheren Zhu, Chihang Wei
DOI: https://doi.org/10.1109/ddcls55054.2022.9858406
2022-01-01
Abstract: Data-driven virtual sensors, or soft sensors, are important tools for predicting quality variables or key performance indicators (KPIs) in many industrial processes. However, existing virtual-sensor models generally assume that the response variable or the model error structure satisfies normality and homoscedasticity. In many practical applications, the response variable of interest is a nonnegative integer, or count, to be modeled or analyzed from a set of explanatory variables. Count data usually violate these assumptions and exhibit heteroscedasticity and a skewed distribution. To model and analyze count data, this paper proposes a stacked denoising autoencoders-based Poisson regression (SDAE-PR) model. In SDAE-PR, stacked denoising autoencoders extract a high-level feature representation of the data, and Poisson regression is then performed on this representation. Unlike the conventional Poisson regression model, which uses hand-crafted features, SDAE-PR extracts high-level feature representations, which both improves the prediction accuracy of the Poisson regression model and makes it more robust to noise. In addition, SDAE-PR inherits a merit of Poisson regression: it ensures non-negative predictions of the response variable, which is key for count data modeling and analysis. The experimental results demonstrated that the proposed SDAE-PR model achieves higher prediction accuracy than the other state-of-the-art methods.
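The non-negativity property mentioned in the abstract follows from the log link of Poisson regression: the predicted mean is exp(Hw + b), which is positive for any inputs. The sketch below illustrates only that regression stage and is not the authors' implementation. The feature matrix `H` is a synthetic stand-in for the representation an SDAE encoder would produce, and the function names, learning rate, and iteration count are assumptions chosen for illustration.

```python
import numpy as np


def fit_poisson_regression(H, y, lr=0.05, n_iter=2000):
    """Fit Poisson regression with a log link by gradient descent.

    Minimizes the mean negative log-likelihood, whose gradient with
    respect to the linear predictor is (mu - y).
    """
    n, d = H.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(n_iter):
        mu = np.exp(H @ w + b)          # predicted Poisson mean, always > 0
        w -= lr * (H.T @ (mu - y)) / n  # gradient step for the weights
        b -= lr * np.mean(mu - y)       # gradient step for the intercept
    return w, b


def predict_mean(H, w, b):
    """Predicted count mean; the exponential guarantees positivity."""
    return np.exp(H @ w + b)


# Synthetic data (assumption): H plays the role of learned features.
rng = np.random.default_rng(0)
H = rng.normal(size=(500, 3))
true_w = np.array([0.5, -0.3, 0.2])
y = rng.poisson(np.exp(H @ true_w + 0.1))

w, b = fit_poisson_regression(H, y)
y_hat = predict_mean(H, w, b)
```

In the full model described above, `H` would instead come from the encoder of the trained stacked denoising autoencoders. A Gaussian linear regression on the same features could return negative predictions, which is the limitation the log link avoids.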