Employing Software Diversity in Cloud Microservices to Engineer Reliable and Performant Systems

Nazanin Akhtarian, Hamzeh Khazaei, Marin Litoiu
2024-07-10
Abstract: In the ever-shifting landscape of software engineering, we recognize the need for adaptation and evolution to maintain system dependability. As each software iteration potentially introduces new challenges, from unforeseen bugs to performance anomalies, it becomes paramount to understand and address these intricacies to ensure robust system operations during the lifetime. This work proposes employing software diversity to enhance system reliability and performance simultaneously. A cornerstone of our work is the derivation of a reliability metric. This metric encapsulates the reliability and performance of each software version under adverse conditions. Using the calculated reliability score, we implemented a dynamic controller responsible for adjusting the population of each software version. The goal is to maintain a higher replica count for more reliable versions while preserving the diversity of versions as much as possible. This balance is crucial for ensuring not only the reliability but also the performance of the system against a spectrum of potential failures. In addition, we designed and implemented a diversity-aware autoscaling algorithm that maintains the reliability and performance of the system at the same time and at any scale. Our extensive experiments on realistic cloud microservice-based applications show the effectiveness of the proposed approach in this paper in promoting both reliability and performance.
Software Engineering, Performance
What problem does this paper attempt to address?
The paper asks how to improve the reliability and performance of a system simultaneously by introducing software diversity into a cloud microservice architecture. Specifically, the authors propose a method based on multi-version containers and an adaptive auto-scaling algorithm. The method is meant to cope with the unforeseen errors and performance anomalies that software iterations can introduce, and to keep the system running stably throughout its life cycle.

### Main Problems

1. **Balance between Reliability and Performance**:
   - Each software iteration may introduce new challenges, such as unforeseen errors and performance anomalies.
   - The key issue is how to maintain the system's reliability and performance through these changes.
2. **Multi-version Management**:
   - Maintaining and managing a multi-version software system requires an intelligent mechanism that selects and deploys among the versions to optimize performance and reliability.
   - A method is needed to quantify the reliability of each version and to adjust the number and distribution of versions dynamically according to that evaluation.
3. **Adaptive Scaling**:
   - When the workload fluctuates, resources must be allocated dynamically so that the system's reliability and performance are not degraded.
   - The auto-scaling strategy needs to consider multiple factors, such as CPU utilization, response time, and memory usage.

### Solutions

To solve the above problems, the paper proposes the following methods:

1. **Conceptualization of Multi-version Containers**:
   - Handle multi-version containers transparently, so that users do not need to care about the underlying version differences. This simplifies the management and maintenance of multi-version systems.
2. **Implementation of Dynamic Controllers**:
   - Implement a dynamic controller that adjusts the number of replicas according to the reliability score of each software version, giving more reliable versions more replicas while preserving version diversity as much as possible.
3. **Diversity Factor (DF)**:
   - Introduce the diversity factor to quantify how replicas are distributed across versions, and advocate a balanced, reliability-centric deployment.
4. **Diversity-aware Auto-scaling Algorithm**:
   - Design and implement a diversity-aware auto-scaling algorithm that allocates resources dynamically according to real-time workload and version reliability, similar to "survival of the fittest" in natural selection.

### Formula Explanation

The key formulas involved in the paper include:

- **Reliability Score Calculation Formula**:
  \[ U_{\text{reliability}}(\theta(t)|\phi)=\sum_{i = 1}^{N}w_i\cdot u_i(\theta_i(t)|\phi) \]
  where $\theta(t)$ is the vector of metrics at time $t$, $\phi$ represents the additional parameters that affect the utility function, $w_i$ is the weight of metric $i$, and $u_i(\theta_i(t)|\phi)$ is the individual utility function of metric $i$.
- **Diversity Factor Definition**:
  \[ DF=\frac{1}{\sigma(R)} \]
  where $\sigma(R)$ is the standard deviation of the replica distribution. For example, for versions $V_1, V_2, V_3$ with replica counts $R_1, R_2, R_3$:
  \[ \sigma(R)=\sqrt{\frac{(R_1-\bar{R})^2+(R_2-\bar{R})^2+(R_3-\bar{R})^2}{3}} \]
  where $\bar{R}$ is the average number of replicas.

Through these methods and formulas, the paper shows how software diversity can be used in a cloud microservice architecture to improve both the reliability and the performance of the system.
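
To make the two formulas above concrete, the following is a minimal Python sketch, not the authors' implementation. The metric names, weights, the linear utility function `linear_utility`, and its limits are all hypothetical choices made for illustration; the summary does not specify them. The sketch also has to decide what to do when every version has the same replica count, because $\sigma(R)=0$ makes $DF$ undefined as written; returning infinity here is an assumption.

```python
from statistics import pstdev


def linear_utility(limit):
    """Hypothetical utility u_i: 1.0 at a metric value of 0, falling to 0.0 at `limit`."""
    def u(value):
        return max(0.0, 1.0 - value / limit)
    return u


def reliability_score(metrics, weights, utilities):
    """Weighted sum of per-metric utilities: sum_i w_i * u_i(theta_i)."""
    return sum(weights[name] * utilities[name](value)
               for name, value in metrics.items())


def diversity_factor(replicas):
    """DF = 1 / sigma(R), using the population standard deviation (divide by N)."""
    sigma = pstdev(replicas)
    if sigma == 0:
        # Assumption: a perfectly balanced distribution is treated as maximal diversity.
        return float("inf")
    return 1.0 / sigma


# Hypothetical metrics for one software version (names and limits are illustrative).
metrics = {"response_time_ms": 200.0, "error_rate": 0.05}
weights = {"response_time_ms": 0.5, "error_rate": 0.5}
utilities = {
    "response_time_ms": linear_utility(1000.0),
    "error_rate": linear_utility(0.5),
}

score = reliability_score(metrics, weights, utilities)
balanced = diversity_factor([3, 2, 1])   # replicas spread fairly evenly
skewed = diversity_factor([4, 2, 0])     # one version dominates
print(f"score={score:.3f} balanced DF={balanced:.3f} skewed DF={skewed:.3f}")
```

Under this definition a more even replica distribution has a smaller standard deviation and therefore a larger DF, which matches the stated goal of preserving version diversity while favoring more reliable versions.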