Deep Reinforcement Learning for Digital Twin-Oriented Complex Networked Systems

Jiaqi Wen, Bogdan Gabrys, Katarzyna Musial
2024-11-09
Abstract: The Digital Twin Oriented Complex Networked System (DT-CNS) aims to build and extend a Complex Networked System (CNS) model with progressively increasing dynamics complexity towards an accurate reflection of reality -- a Digital Twin of reality. Our previous work proposed evolutionary DT-CNSs to model the long-term adaptive network changes in an epidemic outbreak. This study extends this framework by proposing the temporal DT-CNS model, where reinforcement learning-driven nodes make decisions on temporal directed interactions in an epidemic outbreak. We consider cooperative nodes, as well as egocentric and ignorant "free-riders" in the cooperation. We describe this epidemic spreading process with the Susceptible-Infected-Recovered ($SIR$) model and investigate the impact of epidemic severity on the epidemic resilience for different types of nodes. Our experimental results show that (i) full cooperation leads to a higher reward and a lower infection number than cooperation with egocentric or ignorant "free-riders"; (ii) an increasing number of "free-riders" in a cooperation leads to a smaller reward, while an increasing number of egocentric "free-riders" further escalates the infection numbers; and (iii) higher infection rates and slower recovery weaken networks' resilience to severe epidemic outbreaks. These findings also indicate that promoting cooperation and reducing "free-riders" can improve public health during epidemics.
Artificial Intelligence
What problem does this paper attempt to address?
The problem this paper addresses is how to use node decisions driven by deep reinforcement learning (DRL) to optimize cooperation and free-riding behaviors in complex networked systems (CNSs), especially during epidemic outbreaks. Specifically, the research constructs a temporal Digital Twin-Oriented Complex Networked System (DT-CNS) to simulate and analyze how different types of nodes (cooperative, egocentric, and ignorant) behave during epidemic spread and how they affect network resilience.

### Research Background and Problem Description

1. **Digital Twin-Oriented Complex Networked System (DT-CNS)**
   - Traditional complex networked system (CNS) models struggle to reflect real-world dynamics accurately.
   - Digital twin (DT) technology brings the model closer to reality by progressively increasing its complexity.
   - The temporal DT-CNS proposed in this paper aims to improve model accuracy further by introducing the time dimension and a real-time feedback mechanism.
2. **Epidemic Spread and Node Behaviors**
   - During epidemic outbreaks, nodes in the network (such as individuals or organizations) must decide whether to interact with other nodes.
   - Node behaviors fall into three categories: cooperative, egocentric, and ignorant. Cooperative nodes tend to maximize the overall benefit, while egocentric and ignorant nodes may act in ways that harm the collective.
   - This paper uses the SIR (Susceptible-Infected-Recovered) model to describe epidemic spread and studies how the different node behaviors affect epidemic resilience and the number of infections.
3. **Application of Deep Reinforcement Learning**
   - Reinforcement learning (RL) lets agents adjust their strategies according to feedback from the environment in order to reach optimal decisions.
   - Deep reinforcement learning (DRL) improves learning speed and performance by combining RL with deep neural networks.
   - This paper applies DRL to the DT-CNS so that nodes can make decisions based on real-time information, thereby improving the epidemic resilience of the network.

### Main Contributions

1. **Proposing the Temporal DT-CNS Framework**: The framework covers the temporal decisions of nodes during epidemic outbreaks, taking into account the heterogeneous features and connection preferences of nodes.
2. **Introducing Heterogeneous Preference Mutation Styles**: These styles characterize the cooperation and free-riding behaviors of nodes.
3. **Applying Deep Reinforcement Learning Algorithms**: DRL drives the temporal decisions of nodes to optimize the overall social reward of the network.
4. **Experimental Results**: Full cooperation yields higher rewards and fewer infections than cooperation that includes "free-riders"; as the number of free-riding nodes increases, the epidemic resilience of the network weakens; and a higher infection rate and a slower recovery further weaken the network's resilience.

### Conclusion

By combining deep reinforcement learning with the temporal Digital Twin-Oriented Complex Networked System, this paper offers a new perspective on how different node behaviors affect network resilience during epidemic spread. The results indicate that promoting cooperation and reducing free-riding can improve public health during epidemic outbreaks.
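
To make the SIR dynamics and the per-node interaction decision described in this summary more concrete, here is a minimal Python sketch. It is not the paper's implementation. The names `simulate_sir` and `count_states`, the parameters `beta` (infection probability per risky contact), `gamma` (recovery probability per step) and `interact_prob`, the single initially infected node, and the uniform random choice of a contact are all assumptions made for illustration. In particular, the fixed `interact_prob` stands in for the decision that the paper's nodes learn through DRL.

```python
import random


def count_states(state):
    """Return (S, I, R) counts for a list of per-node state labels."""
    return (state.count("S"), state.count("I"), state.count("R"))


def simulate_sir(n_nodes, beta, gamma, interact_prob, steps, seed=0):
    """Discrete-time SIR over random temporal directed interactions.

    Hypothetical stand-in for learned behavior: each node initiates one
    directed interaction per step with fixed probability `interact_prob`.
    Returns a list of (S, I, R) counts, starting with the initial state.
    """
    rng = random.Random(seed)
    state = ["S"] * n_nodes
    state[0] = "I"  # assumption: a single initially infected node
    history = [count_states(state)]

    for _ in range(steps):
        new_state = list(state)

        # Interaction phase: node u may contact a randomly chosen node v.
        for u in range(n_nodes):
            if rng.random() >= interact_prob:
                continue  # u decides not to interact this step
            v = rng.randrange(n_nodes)
            if v == u:
                continue
            pair = (state[u], state[v])
            # Infection can pass in either direction along the contact.
            if pair == ("S", "I") and rng.random() < beta:
                new_state[u] = "I"
            elif pair == ("I", "S") and rng.random() < beta:
                new_state[v] = "I"

        # Recovery phase: nodes infected at the start of the step may recover.
        for u in range(n_nodes):
            if state[u] == "I" and rng.random() < gamma:
                new_state[u] = "R"

        state = new_state
        history.append(count_states(state))

    return history


if __name__ == "__main__":
    cautious = simulate_sir(200, beta=0.4, gamma=0.1, interact_prob=0.2, steps=60)
    sociable = simulate_sir(200, beta=0.4, gamma=0.1, interact_prob=0.9, steps=60)
    print("final (S, I, R), low interaction: ", cautious[-1])
    print("final (S, I, R), high interaction:", sociable[-1])
```

Varying `beta` and `gamma` in this sketch corresponds loosely to the "epidemic severity" the paper studies, while replacing the fixed `interact_prob` with a per-node policy is where a reinforcement learning agent would plug in.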