Real-time semiparametric regression

Jan Luts, Tamara Broderick, Matt P. Wand
DOI: https://doi.org/10.48550/arXiv.1209.3550
2013-02-06
Abstract: We develop algorithms for performing semiparametric regression analysis in real time, with data processed as it is collected and made immediately available via modern telecommunications technologies. Our definition of semiparametric regression is quite broad and includes, as special cases, generalized linear mixed models, generalized additive models, geostatistical models, wavelet nonparametric regression models and their various combinations. Fast updating of regression fits is achieved by couching semiparametric regression into a Bayesian hierarchical model or, equivalently, graphical model framework and employing online mean field variational ideas. An internet site attached to this article, http://realtime-semiparametric-regression.net, illustrates the methodology for continually arriving stock market, real estate and airline data. Flexible real-time analyses, based on increasingly ubiquitous streaming data sources, stand to benefit.
Methodology
What problem does this paper attempt to address?
This paper addresses how to perform semiparametric regression analysis in real time on streaming data. Traditional semiparametric regression methods usually work in batch mode: the analysis runs once, after all the data have been collected. Batch processing has two drawbacks: statistical analysis must wait until the entire data set is assembled, and the entire data set sometimes has to be held in memory. The paper proposes a method based on online mean field variational Bayes (OMVB) that updates the regression fit as each new observation arrives, which removes both limitations.

The main contributions of the paper are:

1. **Algorithm Development**: Algorithms for real-time semiparametric regression that process data as it is collected and make the results immediately available via modern telecommunications technologies.
2. **Model Framework**: Semiparametric regression is embedded in a Bayesian hierarchical model or, equivalently, a graphical model framework, and fast updates are obtained with online mean field variational ideas.
3. **Application Example**: The method is demonstrated on continually arriving stock market, real estate, and airline data through an accompanying website (realtime-semiparametric-regression.net).
4. **Model Scope**: The definition of semiparametric regression is broad and includes generalized linear mixed models, generalized additive models, geostatistical models, wavelet nonparametric regression models, and their various combinations.

The core of the paper is an efficient and flexible method for real-time data analysis that is especially suited to increasingly common streaming data sources. With this method, fits can be analyzed and updated while data are still being collected, without storing the full data set.
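To make the idea of online updating concrete, the sketch below applies mean field variational Bayes to the simplest possible case, a straight-line regression with one predictor. It is an illustration under stated assumptions, not the paper's algorithm: the class name `OnlineMFVBLinearRegression`, the Gaussian prior on the coefficients, the Inverse-Gamma prior on the noise variance, the hyperparameter defaults, and the simulated data are all assumptions made for this example. The point it shows is that each new observation only changes a handful of running sums, after which one sweep of the variational updates refreshes the fit, so no past data need to be kept.

```python
import random


class OnlineMFVBLinearRegression:
    """Streaming mean field variational Bayes for y = b0 + b1 * x + noise.

    Assumed model (illustrative, not taken from the paper):
      b0, b1 ~ N(0, sigma_beta2), sigma2 ~ Inverse-Gamma(A, B),
    with the factorized approximation q(b0, b1) * q(sigma2).
    """

    def __init__(self, sigma_beta2=1e6, A=0.01, B=0.01):
        self.sigma_beta2, self.A, self.B = sigma_beta2, A, B
        self.n = 0
        # Running sufficient statistics: entries of X'X, X'y and y'y.
        self.sx = self.sxx = self.sy = self.sxy = self.syy = 0.0
        self.mu = (0.0, 0.0)           # mean of q(b0, b1)
        self.mu_recip_sigma2 = A / B   # E_q[1 / sigma2], started at its prior mean

    def update(self, x, y):
        """Absorb one observation and run one coordinate-ascent sweep."""
        self.n += 1
        self.sx += x
        self.sxx += x * x
        self.sy += y
        self.sxy += x * y
        self.syy += y * y
        r, n = self.mu_recip_sigma2, self.n

        # q(beta): precision = r * X'X + I / sigma_beta2, inverted as a 2x2 matrix.
        p00 = r * n + 1.0 / self.sigma_beta2
        p01 = r * self.sx
        p11 = r * self.sxx + 1.0 / self.sigma_beta2
        det = p00 * p11 - p01 * p01
        s00, s01, s11 = p11 / det, -p01 / det, p00 / det
        m0 = r * (s00 * self.sy + s01 * self.sxy)
        m1 = r * (s01 * self.sy + s11 * self.sxy)
        self.mu = (m0, m1)

        # q(sigma2): E_q||y - X beta||^2 computed from the sufficient statistics.
        e_rss = (self.syy - 2.0 * (m0 * self.sy + m1 * self.sxy)
                 + n * (s00 + m0 * m0)
                 + 2.0 * self.sx * (s01 + m0 * m1)
                 + self.sxx * (s11 + m1 * m1))
        self.mu_recip_sigma2 = (self.A + 0.5 * n) / (self.B + 0.5 * e_rss)


# Simulated stream (synthetic data, for illustration only).
rng = random.Random(0)
model = OnlineMFVBLinearRegression()
for _ in range(2000):
    x = rng.uniform(-1.0, 1.0)
    model.update(x, 2.0 + 3.0 * x + rng.gauss(0.0, 0.5))
print("posterior mean of (b0, b1):", model.mu)
print("estimated noise variance:", 1.0 / model.mu_recip_sigma2)
```

Memory use is constant in the number of observations because only five running sums and the current variational parameters are stored. Running a single sweep per observation is a design choice made here for brevity; each update starts from the previous fit, so the parameters track the batch solution closely once a moderate amount of data has arrived.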