Stanley I. M. Ko, Terence T. L. Chong, Pulak Ghosh
Abstract: This paper proposes a new Bayesian multiple change-point model which is based on the hidden Markov approach. The Dirichlet process hidden Markov model does not require the specification of the number of change-points a priori. Hence our model is robust to model specification in contrast to the fully parametric Bayesian model. We propose a general Markov chain Monte Carlo algorithm which only needs to sample the states around change-points. Simulations for a normal mean-shift model with known and unknown variance demonstrate advantages of our approach. Two applications, namely the coal-mining disaster data and the real United States Gross Domestic Product growth, are provided. We detect a single change-point for both the disaster data and US GDP growth. All the change-point locations and posterior inferences of the two applications are in line with existing methods.
What problem does this paper attempt to address?
This paper addresses the problem of constructing, within the Bayesian framework, a multiple change-point model that does not require the number of change points to be specified in advance. Specifically, the authors propose the Dirichlet Process Hidden Markov Multiple Change-point Model (DPHMM), which builds on the hidden Markov approach. Because it removes the need to fix the number of change points, as traditional fully parametric Bayesian models require, the model is more robust to misspecification.
### Background and Problem Description of the Paper
In time-series analysis, change-point detection is an important problem: it aims to identify the time points at which the parameters of the data-generating process change. Traditional Bayesian change-point models usually require the number of change points to be fixed in advance, which can lead to model misspecification and biased results. For example:
- Chernoff and Zacks (1964) assumed that the change-point probability at each time point is constant.
- Smith (1975) studied single-change-point models under different parameter assumptions.
- Carlin et al. (1992) introduced the Markov chain Monte Carlo (MCMC) method to derive the posterior distribution.
- Chib (1998) allowed the change-point probability to depend on the states between adjacent change points.
However, these methods all require the number of change points to be known or estimated in advance, and in practical applications this number is often unknown and difficult to determine.
### The Proposed New Method
To address these problems, the paper proposes a new Bayesian multiple change-point model, the Dirichlet Process Hidden Markov Multiple Change-point Model (DPHMM). The main features of this model are:
1. **No Need to Pre-set the Number of Change Points**: By using the Dirichlet process as a prior, the model determines the number of change points adaptively from the observed data.
2. **Left-to-Right Transition Dynamics**: The model adopts a left-to-right transition mechanism, so the state can only stay where it is or move forward, never backward.
3. **Efficient MCMC Algorithm**: Only the states near the change points need to be sampled, which improves computational efficiency.
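The left-to-right restriction in item 2 can be illustrated with a small sketch. This is not the paper's DPHMM: it assumes a fixed number of regimes and hypothetical "stay" probabilities (`stay_probs`), whereas the DPHMM places a Dirichlet process prior on the transitions. It only shows what a forward-only state path looks like.

```python
import numpy as np

def left_to_right_matrix(stay_probs):
    """Build a left-to-right transition matrix.

    stay_probs[i] is a hypothetical probability of remaining in regime i;
    the remaining mass moves to regime i + 1. The last regime is absorbing.
    """
    k = len(stay_probs) + 1  # number of regimes (fixed here for illustration)
    P = np.zeros((k, k))
    for i, p in enumerate(stay_probs):
        P[i, i] = p
        P[i, i + 1] = 1.0 - p
    P[-1, -1] = 1.0
    return P

def sample_states(P, n, rng):
    """Sample a state path of length n that starts in the first regime."""
    states = np.empty(n, dtype=int)
    states[0] = 0
    for t in range(1, n):
        states[t] = rng.choice(P.shape[0], p=P[states[t - 1]])
    return states

rng = np.random.default_rng(0)
P = left_to_right_matrix([0.95, 0.9])
states = sample_states(P, 50, rng)
```

Every sampled path is non-decreasing and advances by at most one regime per step, so each increment in `states` marks a change point.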
### Specific Form of the Model
In the DPHMM, the observed time series is \(Y_n=(y_1,y_2,\ldots,y_n)'\), and the conditional density \(p(y_t|Y_{t - 1},\theta)\) of each observation depends on a parameter \(\theta\) whose value changes from one time period to the next. Specifically, the model can be written as:
\[
y_t\sim
\begin{cases}
p(y_t|Y_{t - 1},\theta_1)&\text{if }t\leq\tau_1,\\
p(y_t|Y_{t - 1},\theta_2)&\text{if }\tau_1 < t\leq\tau_2,\\
\vdots\\
p(y_t|Y_{t - 1},\theta_k)&\text{if }\tau_{k - 1}<t\leq\tau_k,\\
p(y_t|Y_{t - 1},\theta_{k + 1})&\text{if }\tau_k < t\leq n,
\end{cases}
\]
where \(\tau_1<\tau_2<\cdots<\tau_k\) are the change points and \(\theta_i\in\mathbb{R}^l\) is an \(l\)-dimensional parameter vector for \(i = 1,2,\ldots,k + 1\). Introducing a discrete state indicator \(s_t\) that takes values in \(\{1,2,\ldots,k,k + 1\}\), the model can be written compactly as \(y_t|s_t\sim p(y_t|Y_{t - 1},\theta_{s_t})\).
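The relationship between the change points \(\tau_i\) and the indicator \(s_t\) can be made concrete with a short sketch. The function names are hypothetical, and the normal likelihood with regime-specific means is only an assumed example of \(p(y_t|Y_{t - 1},\theta_{s_t})\), chosen because the paper's simulations use a normal mean-shift model.

```python
import numpy as np

def states_from_changepoints(taus, n):
    """Return s_1..s_n (1-based regime labels) for change points tau_1 < ... < tau_k.

    s_t = 1 for t <= tau_1, s_t = 2 for tau_1 < t <= tau_2, and so on.
    """
    t = np.arange(1, n + 1)
    # side="left" counts the change points strictly smaller than t
    return 1 + np.searchsorted(np.asarray(taus), t, side="left")

def normal_loglik(y, states, means, sigma):
    """Log-likelihood of y when y_t ~ N(means[s_t - 1], sigma^2), independently."""
    mu = np.asarray(means)[states - 1]
    resid = (np.asarray(y) - mu) / sigma
    return float(np.sum(-0.5 * resid**2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)))

s = states_from_changepoints([3, 5], n=7)
y = np.array([0.0, 0.0, 0.0, 2.0, 2.0, -1.0, -1.0])
ll = normal_loglik(y, s, means=[0.0, 2.0, -1.0], sigma=1.0)
```

With two change points at \(t=3\) and \(t=5\), the seven observations fall into three regimes, and the log-likelihood is evaluated by looking up each observation's regime mean through `s`.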
### Application and Verification
To verify the effectiveness of the proposed model, the authors carried out the following work:
- **Simulation Study**: Monte Carlo simulations on the normal mean-shift model, with both known and unknown variance, demonstrate the advantages of the new method.
- **Practical Application**: The model is applied to the coal-mining disaster data and to real US GDP growth. A single change point is detected in each series, and the change-point locations and posterior inferences are in line with existing methods.
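A minimal sketch of what a normal mean-shift simulation might look like is given below. The sample size, change-point location, means, and variance are hypothetical values, not the paper's simulation settings. The estimator is a simple least-squares search for a single change point, used here only as a stand-in; it is not the paper's MCMC algorithm.

```python
import numpy as np

def simulate_mean_shift(n, tau, mu_before, mu_after, sigma, rng):
    """Draw y_1..y_n with mean mu_before for t <= tau and mu_after for t > tau."""
    means = np.where(np.arange(1, n + 1) <= tau, mu_before, mu_after)
    return means + sigma * rng.standard_normal(n)

def least_squares_changepoint(y):
    """Return the tau in 1..n-1 that minimises the two-segment sum of squared errors."""
    n = len(y)
    best_tau, best_sse = None, np.inf
    for tau in range(1, n):
        left, right = y[:tau], y[tau:]
        sse = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if sse < best_sse:
            best_tau, best_sse = tau, sse
    return best_tau

rng = np.random.default_rng(42)
y = simulate_mean_shift(n=100, tau=50, mu_before=0.0, mu_after=3.0, sigma=1.0, rng=rng)
tau_hat = least_squares_changepoint(y)
```

Repeating the draw over many seeds and recording `tau_hat` each time gives a basic Monte Carlo picture of how well a method recovers the true change point.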