Abstract: Various probabilistic time series forecasting models have sprung up and shown remarkably good performance. However, the choice of model relies heavily on the characteristics of the input time series and on the fixed distribution that the model is based on. Because probability distributions cannot be averaged across different models in a straightforward way, current time series model ensemble methods cannot be directly applied to improve the robustness and accuracy of forecasting. To address this issue, we propose pTSE, a multi-model distribution ensemble method for probabilistic forecasting based on the Hidden Markov Model (HMM). pTSE takes only off-the-shelf outputs from member models and requires no further information about each model. In addition, we provide a complete theoretical analysis of pTSE, proving that the empirical distribution of a time series subject to an HMM converges to the stationary distribution almost surely. Experiments on benchmarks show the superiority of pTSE over all member models and competitive ensemble methods.

What problem does this paper attempt to address?

### Problems the paper attempts to solve

The paper addresses model ensembling in probabilistic time-series forecasting. Existing time-series ensemble methods cannot be directly applied to improve the robustness and accuracy of forecasts, because the probability distributions produced by different models cannot simply be averaged. To overcome this, the authors propose pTSE (probabilistic Time Series Ensemble), a multi-model distribution ensemble method based on the Hidden Markov Model (HMM).

### Main contributions

1. **Propose pTSE**: pTSE is a multi-model ensemble method for probabilistic time-series forecasting. It requires only the outputs of the member models and no further information about them.
2. **Theoretical verification**: The authors prove that the ensemble distribution found by pTSE is the distribution that the time series approximately follows over any time period.
3. **Empirical results**: Experiments on real-world datasets show that pTSE outperforms each single member model as well as ensemble methods designed for point-estimate models.

### Method overview

#### 2.1 Preliminaries

- **HMM**: A Hidden Markov Model (HMM) describes the joint probability of a set of random variables. The observed variable \(O_t\) can be continuous or discrete, and each \(O_t\) has a corresponding hidden state \(S_t\). An HMM satisfies the following conditions:

  \[ p(S_{t + 1}|S_1,\ldots,S_t)=p(S_{t + 1}|S_t) \]

  \[ p(O_t|S_1,\ldots,S_T,O_1,\ldots,O_T)=p(O_t|S_t) \]

- **HMM fitting**: Fitting an HMM requires determining the following parameters:
  - Transition matrix \(A=(a_{ij})_{1\leq i,j\leq K}\), where \(a_{ij}=p(S_{t + 1}=j|S_t = i)\)
  - Set of emission function parameters \(\Theta=\{\theta_k\}_{k = 1}^K\)
  - Initial distribution \(\pi=(\pi_1,\ldots,\pi_K)\), where \(\pi_k=p(S_0 = k)\)

  The fitting process usually uses maximum likelihood estimation (MLE) or an equivalent form:

  \[ \arg\max_{A,\pi,\Theta}p(\{O_t\}_{t = 1}^T|A,\pi,f_k(O_t;\theta_k\in\Theta)) \]

#### 2.2 Framework foundation

- **Problem definition**: A probabilistic forecasting problem usually requires estimating the conditional distribution \(p(y_t|M(X_t))\), given the trained model \(M\) and the feature vector \(X_t\).
- **Model assumption**: Suppose there are \(K\) probabilistic forecasting models \(\{M_k\}_{k = 1}^K\), fitted independently to the same dataset \(\{y_t\}_{t = 1}^T\). At each time point \(t\), there is an optimal model \(M_{k_t}\) such that the distribution of \(y_t\) is determined by \(M_{k_t}(X_t)\), that is, \(y_t\sim p(y_t|M_{k_t}(X_t))\).
- **Model transition**: For \(y_{t + 1}\), assume that \(M_{k_t}\) transitions at random to a new optimal model \(M_{k_{t+1}}\) with probability \(p_{k_t,k_{t+1}}\), so the sequence of optimal models forms a Markov process.

#### 2.3 Mixed quantile estimation

- **Problem definition**: For probabilistic prediction methods, usually the PDF \(f_{X_t}^{M_k}(y_t)\) is not directly evaluated, but instead the...
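
The model assumption in Section 2.2 and the convergence claim in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: the transition matrix `A`, the initial distribution `pi0`, and the Gaussian predictive distributions in `emissions` are hypothetical values chosen for illustration, and the helper names `stationary_distribution` and `simulate_states` are invented here. It samples which of \(K = 3\) member models is optimal at each step, draws \(y_t\) from that model's distribution, and compares the empirical state frequencies with the stationary distribution of `A`.

```python
import random


def stationary_distribution(A, iters=1000):
    """Approximate the stationary distribution of a row-stochastic
    matrix A by repeatedly applying pi <- pi A (power iteration)."""
    K = len(A)
    pi = [1.0 / K] * K
    for _ in range(iters):
        pi = [sum(pi[i] * A[i][j] for i in range(K)) for j in range(K)]
    return pi


def simulate_states(A, pi0, T, rng):
    """Sample a hidden-state path of length T: the first state is drawn
    from pi0, each later state from the row of A for the previous state."""
    K = len(A)
    states = [rng.choices(range(K), weights=pi0)[0]]
    for _ in range(T - 1):
        states.append(rng.choices(range(K), weights=A[states[-1]])[0])
    return states


# Hypothetical transition matrix: A[i][j] = p(S_{t+1} = j | S_t = i).
A = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.20, 0.20, 0.60],
]
pi0 = [1.0, 0.0, 0.0]  # hypothetical initial distribution

# Hypothetical predictive distribution of each member model,
# given as (mean, standard deviation) of a Gaussian.
emissions = [(0.0, 1.0), (3.0, 0.5), (-2.0, 2.0)]

rng = random.Random(0)
states = simulate_states(A, pi0, T=50_000, rng=rng)

# y_t is drawn from the distribution of the model that is optimal at time t.
observations = [rng.gauss(*emissions[k]) for k in states]

stationary = stationary_distribution(A)
empirical = [states.count(k) / len(states) for k in range(len(A))]

print("stationary:", [round(p, 4) for p in stationary])
print("empirical: ", [round(p, 4) for p in empirical])
```

For this particular `A`, solving \(\pi A = \pi\) by hand gives \(\pi = (4/7, 2/7, 1/7)\), so the empirical frequencies should approach roughly 0.571, 0.286, and 0.143 as `T` grows. In this toy setting, where the emission distributions do not depend on \(X_t\), the long-run distribution of the observations is the mixture of the member distributions weighted by \(\pi\), which is one way to read the idea of combining distributions through an HMM rather than averaging them directly.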

pTSE: A Multi-model Ensemble Method for Probabilistic Time Series Forecasting
