Precise Mobility Intervention for Epidemic Control Using Unobservable Information Via Deep Reinforcement Learning

Tao Feng, Tong Xia, Xiaochen Fan, Huandong Wang, Zefang Zong, Yong Li
DOI: https://doi.org/10.1145/3534678.3539195
2022-01-01
Abstract: To control the outbreak of COVID-19, efficient individual mobility intervention strategies for EPidemic Control (EPC) are of great importance: they cut off contact among people at epidemic risk and reduce infections by intervening in the mobility of individuals. Reinforcement Learning (RL) is powerful for decision making; however, there are two major challenges in developing an RL-based EPC strategy: (1) unobservable information about asymptomatic infections in the incubation period makes decision making difficult for RL, and (2) delayed rewards hinder RL learning. Since the results of EPC are reflected in both daily infections (including unobservable asymptomatic infections) and long-term cumulative cases of COVID-19, designing an RL model for precise mobility intervention is daunting. In this paper, we propose a Variational hiErarcHICal reinforcement Learning method for Epidemic control via individual-level mobility intervention, namely Vehicle. To tackle the above challenges, Vehicle first exploits an information rebuilding module that consists of a contact-risk bipartite graph neural network and a variational LSTM to restore the unobservable information. The contact-risk bipartite graph neural network estimates the likelihood that an individual is asymptomatically infected and the risk of this individual spreading the epidemic; these estimates serve as the current state for RL. Then, the variational LSTM further encodes the state sequence to model the latency of epidemic spreading caused by unobservable asymptomatic infections. Finally, a Hierarchical Reinforcement Learning framework is employed to train Vehicle, which contains dual-level agents to solve the delayed reward problem. Extensive experimental results demonstrate that Vehicle can effectively control the spread of the epidemic. Vehicle outperforms the state-of-the-art baseline methods with remarkably high-precision mobility interventions on both symptomatic and asymptomatic infections.
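To make the contact-risk idea in the abstract concrete, the following is a minimal toy sketch of risk propagation on an individual-location bipartite graph. It is not the paper's graph neural network: the function name `propagate_contact_risk`, the `(person, location)` input format, the mean/max aggregation, and the `rounds` and `decay` parameters are all assumptions made purely for illustration. It only shows how risk from observed cases might flow through shared locations to individuals whose infection status is not observed.

```python
from collections import defaultdict


def propagate_contact_risk(visits, known_infected, rounds=2, decay=0.5):
    """Toy risk propagation on a person-location bipartite graph.

    Hypothetical illustration only, not the model from the paper.
    visits: iterable of (person, location) pairs.
    known_infected: set of persons with an observed (symptomatic) infection.
    Returns a dict mapping each person to a risk score in [0, 1].
    """
    people_at = defaultdict(set)  # location -> persons who visited it
    places_of = defaultdict(set)  # person -> locations visited
    for person, location in visits:
        people_at[location].add(person)
        places_of[person].add(location)

    # Observed cases start at risk 1.0, everyone else at 0.0.
    risk = {p: (1.0 if p in known_infected else 0.0) for p in places_of}

    for _ in range(rounds):
        # Person -> location: a location's risk is the mean risk of its visitors.
        loc_risk = {
            loc: sum(risk[p] for p in persons) / len(persons)
            for loc, persons in people_at.items()
        }
        # Location -> person: inherit decayed risk from the riskiest place visited.
        new_risk = {}
        for p, locs in places_of.items():
            inherited = decay * max(loc_risk[loc] for loc in locs)
            new_risk[p] = 1.0 if p in known_infected else max(risk[p], inherited)
        risk = new_risk
    return risk


# Example: "a" is an observed case; "b" shares a cafe with "a",
# "c" shares a gym with "b", and "d" has no shared contacts.
example_visits = [("a", "cafe"), ("b", "cafe"), ("b", "gym"), ("c", "gym"), ("d", "park")]
example_risk = propagate_contact_risk(example_visits, known_infected={"a"})
```

In this toy setting, individuals closer to an observed case in the contact graph receive higher scores, which is the kind of signal an RL agent could use as part of its state when infection status itself is unobservable.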