Resilience and performance quantification of dynamic reconfiguration

Sarah Alhozaimy, Daniel A. Menascé, Massimiliano Albanese
DOI: https://doi.org/10.1016/j.future.2024.05.040
IF: 7.307
2024-05-26
Future Generation Computer Systems
Abstract: Dynamic reconfiguration is an adaptive resilience mechanism that can help address several system design problems. Adaptation through dynamic reconfiguration can improve quality of service, increase fault-tolerance, help recover from failures, and prevent and recover from cyber attacks. This mechanism acts primarily by reconfiguring one or more of a system's resources. While system reconfiguration is advantageous, it may bring disadvantages such as performance and availability degradation during reconfiguration intervals. In this work, we quantify the effectiveness of dynamic reconfiguration as a system resilience mechanism and its impact on performance. We define a failure function that captures the effect of dynamic reconfigurations on a system's resilience to failures and develop metrics that capture the impact of reconfigurations on a system's execution time and probability of failure. We also derive analytic models that predict the effectiveness of dynamic reconfigurations on execution time and resilience to failures. Several theorems regarding the tradeoff between resilience to failures and performance and availability are presented. Finally, we define an optimization problem, formalized with the help of these theorems, to determine the optimal reconfiguration frequency to meet performance-resilience tradeoffs.
computer science, theory & methods