Katie Buchhorn,Kerrie Mengersen,Edgar Santos-Fernandez,Erin E. Peterson,James M. McGree
Abstract: Optimal design facilitates intelligent data collection. In this paper, we introduce a fully Bayesian design approach for spatial processes with complex covariance structures, like those typically exhibited in natural ecosystems. Coordinate Exchange algorithms are commonly used to find optimal design points. However, collecting data at specific points is often infeasible in practice. Currently, there is no provision to allow for flexibility in the choice of design. We also propose an approach to find Bayesian sampling windows, rather than points, via Gaussian process emulation to identify regions of high design efficiency across a multi-dimensional space. These developments are motivated by two ecological case studies: monitoring water temperature in a river network system in the northwestern United States and monitoring submerged coral reefs off the north-west coast of Australia.
What problem does this paper attempt to address?
This paper aims to develop a flexible and efficient sampling design method for spatial processes, particularly in natural ecosystems with complex covariance structures such as river networks and coral reef systems. To that end, it proposes a fully Bayesian design approach for such processes. Existing approaches typically return specific sampling points, but in practice factors such as terrain and accessibility often make it infeasible to collect samples at exactly those points. The paper therefore also proposes finding "sampling windows" rather than specific points, using Gaussian process emulation to identify regions of high design efficiency across a multi-dimensional space. This gives practitioners flexibility in where they collect data while keeping the design close to optimal.
### Main Objectives:
1. **Determine the Optimal Sampling Locations**: Based on the available set of discrete locations, determine a set of optimal sampling locations.
2. **Form Sampling Windows**: Within a specified area, identify regions of near-optimal design efficiency, so that sampling remains efficient when an exact location is difficult to reach because of access or other constraints.
### Application Background:
- **River Network Monitoring**: Monitoring water temperature in a river network system in the northwestern United States, to understand the impact of climate change and land management on the thermal environment.
- **Coral Reef Monitoring**: Monitoring submerged coral reefs off the north-west coast of Australia, to track coral cover and evaluate ecological health.
### Method Innovations:
- **Optimal Design under the Bayesian Framework**: Use Bayesian methods to account for uncertainty in both the model parameters and the statistical model, and use Gaussian process emulation to approximate the utility surface across a high-dimensional space.
- **Coordinate Exchange Algorithm**: Propose a modified random coordinate exchange algorithm that uses a non-parametric acceptance criterion (such as the Wilcoxon rank-sum test) to compare a proposed design with the current best design.
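
The coordinate exchange idea with a rank-sum acceptance step can be illustrated with a minimal sketch. This is not the authors' implementation: the utility function `noisy_utility`, the one-dimensional candidate grid, the one-sided test with a normal approximation (no tie correction in the variance), and the parameters `sweeps`, `n_rep`, and `alpha` are all assumptions chosen for illustration.

```python
import math
import random

def rank_sum_pvalue(a, b):
    """One-sided Wilcoxon rank-sum p-value (normal approximation) for 'a tends to exceed b'."""
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    ranks_a, i = 0.0, 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # average of ranks i+1..j for tied values
        ranks_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n, m = len(a), len(b)
    mean = n * (n + m + 1) / 2
    sd = math.sqrt(n * m * (n + m + 1) / 12)  # no tie correction (assumption)
    z = (ranks_a - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))

def noisy_utility(design, rng):
    """Hypothetical stand-in for a Monte Carlo expected-utility estimate."""
    pts = sorted(design)
    spread = min(b - a for a, b in zip(pts, pts[1:]))
    return spread + rng.gauss(0.0, 0.02)

def coordinate_exchange(candidates, n_points, rng, sweeps=5, n_rep=20, alpha=0.05):
    """Swap one design point at a time; keep the swap only if the test favours it."""
    design = rng.sample(candidates, n_points)
    for _ in range(sweeps):
        for i in range(n_points):
            proposal = list(design)
            proposal[i] = rng.choice([c for c in candidates if c not in design])
            u_new = [noisy_utility(proposal, rng) for _ in range(n_rep)]
            u_cur = [noisy_utility(design, rng) for _ in range(n_rep)]
            if rank_sum_pvalue(u_new, u_cur) < alpha:
                design = proposal
    return design

rng = random.Random(1)
candidates = [i / 10 for i in range(11)]
design = coordinate_exchange(candidates, 3, rng)
print(sorted(design))
```

Comparing two samples of noisy utility estimates with a rank-based test, rather than comparing two single estimates, makes the acceptance step less sensitive to Monte Carlo noise and to skewed utility distributions.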
### Formula Analysis:
- **Kernel Function of Gaussian Process**:
\[
k(x_p, x_s)=\sum_{j = 1}^{q}\exp(-\zeta_j\cdot\text{dist}_j(x_p, x_s))
\]
where the \(\zeta_j\) are hyperparameters, \(q\) is the number of input dimensions, and \(\text{dist}_j(x_p, x_s)=|x_j^p - x_j^s|\) is the distance between the \(j\)-th coordinates of \(x_p\) and \(x_s\).
- **Design Efficiency**:
\[
\text{eff}(d^\star_j)=\frac{\bar{f}(d^\star_j|D,\hat{\zeta})}{\bar{f}(\hat{d}^\star_j|D,\hat{\zeta})}
\]
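where \(\bar{f}(d^\star_j|D,\hat{\zeta})\) is the posterior predictive mean at \(d^\star_j\), and \(\hat{d}^\star_j = \arg\max_{d^\star_j\in D^\star}\bar{f}(d^\star_j|D,\hat{\zeta})\) is the optimal design point over the candidate set \(D^\star\), so that efficiency is measured relative to the best candidate.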
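
To make the two formulas concrete, the following sketch implements the kernel exactly as written above, uses it in a small zero-mean Gaussian process emulator, and converts the emulator's posterior predictive mean into design efficiencies. It is an illustration under stated assumptions, not the paper's code: the training designs and utilities, the value of `zeta_hat`, the `nugget` term, the one-dimensional candidate grid, and the 0.9 efficiency threshold used to form a window are all hypothetical, and the ratio assumes positive utilities.

```python
import math

def kernel(x_p, x_s, zeta):
    """Summed exponential kernel, following the formula as written above."""
    return sum(math.exp(-z * abs(a - b)) for a, b, z in zip(x_p, x_s, zeta))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def emulator_mean(train_x, train_u, new_x, zeta, nugget=1e-6):
    """Zero-mean GP posterior predictive mean at new_x given evaluated designs."""
    K = [[kernel(a, b, zeta) + (nugget if i == j else 0.0)
          for j, b in enumerate(train_x)] for i, a in enumerate(train_x)]
    alpha = solve(K, train_u)
    return [sum(kernel(x, t, zeta) * w for t, w in zip(train_x, alpha)) for x in new_x]

def design_efficiency(pred_mean):
    """Ratio of each predicted utility to the best predicted utility."""
    best = max(pred_mean)
    return [m / best for m in pred_mean]

# Hypothetical evaluated designs D and their estimated utilities.
train_x = [[0.0], [0.25], [0.5], [0.75], [1.0]]
train_u = [1.0, 1.6, 2.0, 1.7, 1.1]
zeta_hat = [3.0]  # assumed hyperparameter estimate

# Candidate set D* on a fine grid; the window keeps candidates above the threshold.
candidates = [[i / 20] for i in range(21)]
pred = emulator_mean(train_x, train_u, candidates, zeta_hat)
eff = design_efficiency(pred)
window = [c[0] for c, e in zip(candidates, eff) if e >= 0.9]
print(window)
```

Any candidate in `window` is predicted to lose at most 10% of the utility of the best candidate, which is the sense in which a sampling window trades a small amount of efficiency for flexibility in where data are collected.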
### Conclusion:
This study provides a flexible and efficient sampling design method for spatial processes, particularly suited to natural ecosystems with complex covariance structures. By introducing "sampling windows", the method gives flexibility in where data are collected while retaining high design efficiency, which supports environmental monitoring and management.