Inference using a composite-likelihood approximation for stochastic metapopulation model of disease spread

Gaël Beaunée, Pauline Ezanno, Alain Joly, Pierre Nicolas, Elisabeta Vergu
2023-11-30
Abstract: Spatio-temporal pathogen spread is often partially observed at the metapopulation scale. Available data correspond to proxies and are incomplete, censored and heterogeneous. Moreover, representing such biological systems often leads to complex stochastic models. Such complexity together with data characteristics make the analysis of these systems a challenge. Our objective was to develop a new inference procedure to estimate key parameters of stochastic metapopulation models of animal disease spread from longitudinal and spatial datasets, while accurately accounting for characteristics of census data. We applied our procedure to provide new knowledge on the regional spread of *Mycobacterium avium* subsp. *paratuberculosis* (*Map*), which causes bovine paratuberculosis, a worldwide endemic disease. *Map* spread between herds through trade movements was modeled with a stochastic mechanistic model. Comprehensive data from 2005 to 2013 on cattle movements in 12,857 dairy herds in Brittany (western France) and partial data on animal infection status in 2,278 herds sampled from 2007 to 2013 were used. Inference was performed using a new criterion based on a Monte-Carlo approximation of a composite likelihood, coupled to a numerical optimization algorithm (Nelder-Mead Simplex-like). Our criterion showed a clear superiority to alternative ones in identifying the right parameter values, as assessed by an empirical identifiability on simulated data. Point estimates and profile likelihoods allowed us to establish the initial state of the system, identify the risk of pathogen introduction from outside the metapopulation, and confirm the assumption of the low sensitivity of the diagnostic test. Our inference procedure could easily be applied to other spatio-temporal infection dynamics, especially when ABC-like methods face challenges in defining relevant summary statistics.
Populations and Evolution, Quantitative Methods, Computation
What problem does this paper attempt to address?
The main problem this paper addresses is how to estimate the key parameters of a stochastic metapopulation model of animal disease spread from longitudinal and spatial datasets, while accurately accounting for the characteristics of census data. Specifically, the authors developed a new inference procedure and applied it to the regional spread of *Mycobacterium avium* subsp. *paratuberculosis* (*Map*), which causes bovine paratuberculosis. The procedure is particularly relevant when ABC-like (approximate Bayesian computation) methods struggle because relevant summary statistics are difficult to define.

### Background of the paper

- **Problem description**: Spatio-temporal pathogen spread is usually only partially observed at the metapopulation scale. The available data are proxies and are incomplete, censored, and heterogeneous. Moreover, representing such biological systems usually leads to complex stochastic models. This complexity, combined with the characteristics of the data, makes analyzing these systems a challenge.
- **Research objectives**:
  - Develop a new inference procedure to estimate the key parameters of stochastic metapopulation models of animal disease spread from longitudinal and spatial datasets.
  - Accurately account for the characteristics of census data.
  - Apply the procedure to provide new knowledge about the regional spread of *Map*.

### Methods and models

- **Model**: A stochastic mechanistic model represents the spread of *Map* between cattle herds through trade movements.
- **Data**: Comprehensive cattle-movement data from 2005 to 2013 for 12,857 dairy herds in Brittany (western France), and partial data on animal infection status in 2,278 herds sampled from 2007 to 2013.
- **Inference method**: A new criterion based on a Monte Carlo approximation of a composite likelihood, coupled with a numerical optimization algorithm (Nelder-Mead Simplex-like).

### Main findings

- **Parameter estimation**: The new criterion clearly outperformed alternative criteria in identifying the correct parameter values, as assessed by an empirical identifiability study on simulated data.
- **Infection situation**: In 2005, more than 80% of dairy herds were infected, but the average infection rate was low.
- **External risk**: The risk of purchasing infected cattle from outside the metapopulation is moderate and stable (about 0.14).
- **Diagnostic test sensitivity**: The average sensitivity of the diagnostic test is low (about 0.21).

### Conclusion

This inference procedure can readily be applied to other spatio-temporal infection dynamics, especially for long-standing endemic diseases. It is particularly useful when ABC-like inference methods struggle because relevant summary statistics are difficult to define.
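
To make the inference method summarized above more concrete, the following is a minimal Python sketch of the general idea: approximate marginal observation probabilities by Monte Carlo simulation, sum their logs into a composite log-likelihood, and maximize it with Nelder-Mead. It is not the authors' model or criterion. The toy dynamics (herds infected initially with probability `p0`, then by external introduction with probability `p_ext` per date, no trade and no recovery), the known sensitivity `SE`, the independence-type composite likelihood, and all names and values are assumptions chosen for illustration.

```python
import numpy as np
from scipy.optimize import minimize

T = 8      # number of observation dates (hypothetical)
SE = 0.5   # test sensitivity, treated as known in this toy example (assumption)

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def infected_fraction(p0, p_ext, u):
    """Monte Carlo estimate of P(herd infected at date t), t = 0..T-1.

    u has shape (T, n_sims) and holds pre-drawn uniforms, so the estimate
    is a deterministic function of the parameters (common random numbers).
    """
    infected = u[0] < p0                      # initial infection status
    frac = [infected.mean()]
    for t in range(1, u.shape[0]):
        infected = infected | (u[t] < p_ext)  # external introduction, no recovery
        frac.append(infected.mean())
    return np.array(frac)

def neg_log_cl(x, y, tested, u):
    """Negative log composite likelihood built from marginal Bernoulli terms."""
    p0, p_ext = expit(x)                      # parameters live on the logit scale
    p_pos = np.clip(SE * infected_fraction(p0, p_ext, u), 1e-9, 1 - 1e-9)
    ll = y * np.log(p_pos) + (1 - y) * np.log1p(-p_pos)  # broadcast over herds
    return -np.sum(ll[tested])                # only tested herd-dates contribute

# Synthetic data: H herds, partially tested, imperfect test (values are made up).
rng = np.random.default_rng(1)
H, true_p0, true_p_ext = 2000, 0.3, 0.05
u_data = rng.random((T, H))
status = np.empty((T, H), dtype=bool)
status[0] = u_data[0] < true_p0
for t in range(1, T):
    status[t] = status[t - 1] | (u_data[t] < true_p_ext)
y = (status.T & (rng.random((H, T)) < SE)).astype(float)  # shape (H, T)
tested = rng.random((H, T)) < 0.3                         # partial observation

# Fit: fixed simulation draws plus an explicit, wide initial simplex.
u = rng.random((T, 5000))
x0 = np.log([0.1 / 0.9, 0.1 / 0.9])           # logit(0.1) for both parameters
simplex = np.array([x0, x0 + [1.0, 0.0], x0 + [0.0, 1.0]])
res = minimize(neg_log_cl, x0, args=(y, tested, u), method="Nelder-Mead",
               options={"initial_simplex": simplex})
p0_hat, p_ext_hat = expit(res.x)
print(f"p0 estimate: {p0_hat:.3f}, p_ext estimate: {p_ext_hat:.3f}")
```

Two design choices in this sketch are worth noting. Reusing the same simulation draws `u` at every evaluation keeps the Monte Carlo objective deterministic, which a derivative-free simplex search needs in order to compare candidate points consistently. The explicit initial simplex is wide because the simulated objective is piecewise constant at very small scales, so a tiny starting simplex could stall immediately.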