Fitness models provide accurate short-term forecasts of SARS-CoV-2 variant frequency

Eslam Abousamra, Marlin D Figgins, Trevor Bedford
DOI: https://doi.org/10.1101/2023.11.30.23299240
2024-05-29
Abstract: Genomic surveillance of pathogen evolution is essential for public health response, treatment strategies, and vaccine development. In the context of SARS-CoV-2, multiple models have been developed, including Multinomial Logistic Regression (MLR), which describes variant frequency growth, as well as the Fixed Growth Advantage (FGA), Growth Advantage Random Walk (GARW), and Piantham parameterizations, which describe variant Rt. These models provide estimates of variant fitness and can be used to forecast changes in variant frequency. We introduce a framework for evaluating real-time forecasts of variant frequencies and apply it to the evolution of SARS-CoV-2 during 2022, when multiple new viral variants emerged and spread rapidly through the population. We compare models across representative countries with different intensities of genomic surveillance. Retrospective assessment of model accuracy shows that most models of variant frequency perform well and produce reasonable forecasts. We find that the simple MLR model provides ∼0.6% median absolute error and ∼6% mean absolute error when forecasting 30 days out for countries with robust genomic surveillance. We investigate the impact of sequence quantity and quality across countries on forecast accuracy, and conduct systematic downsampling to show that 1000 sequences per week is sufficient for accurate short-term forecasts. We conclude that fitness models represent a useful prognostic tool for short-term evolutionary forecasting.
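To make the MLR forecast and the two error metrics concrete, here is a minimal sketch in Python. It is an illustration only, not the authors' implementation: it assumes each variant's frequency follows a softmax of a linear function of time, and the function names, intercepts, and slopes are hypothetical values chosen for the example rather than estimates from the paper.

```python
import numpy as np


def mlr_frequencies(intercepts, slopes, days):
    """Project variant frequencies under a multinomial logistic growth model.

    Assumption: the logit of variant v at day t is intercepts[v] + slopes[v] * t.
    Returns an array of shape (len(days), n_variants); each row sums to 1.
    """
    intercepts = np.asarray(intercepts, dtype=float)
    slopes = np.asarray(slopes, dtype=float)
    days = np.asarray(days, dtype=float)
    logits = intercepts[None, :] + slopes[None, :] * days[:, None]
    logits -= logits.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


def absolute_error_summary(forecast, observed):
    """Return (median, mean) absolute error between forecast and observed frequencies."""
    errors = np.abs(np.asarray(forecast, dtype=float) - np.asarray(observed, dtype=float))
    return float(np.median(errors)), float(np.mean(errors))


# Hypothetical parameters for three variants; the first is the reference (slope 0).
intercepts = [0.0, -3.0, -6.0]
slopes = [0.0, 0.08, 0.15]  # per-day growth relative to the reference variant

# Frequencies at the forecast date (day 0) and 30 days out.
forecast = mlr_frequencies(intercepts, slopes, days=[0, 30])
```

In a retrospective evaluation, `forecast` would be compared with the frequencies later observed at the same dates via `absolute_error_summary`. Reporting the median alongside the mean is informative because a few large misses can inflate the mean while leaving the median small, which matches the gap between the two figures quoted in the abstract.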