A New Approach for Evaluating the Performance of Distributed Latency-Sensitive Services

Theodoros Theodoropoulos, John Violos, Antonios Makris, Konstantinos Tserpes

2024-05-01

Abstract: Conventional latency metrics are formulated based on a broad definition of traditional monolithic services, and hence lack the capacity to address the complexities inherent in modern services and distributed computing paradigms. Consequently, their effectiveness in identifying areas for improvement is restricted, falling short of providing a comprehensive evaluation of service performance within the context of contemporary services and computing paradigms. More specifically, these metrics do not offer insights into two critical aspects of service performance: the frequency of latency surpassing specified Service Level Agreement (SLA) thresholds and the time required for latency to return to an acceptable level once the threshold is exceeded. This limitation is quite significant in the frame of contemporary latency-sensitive services, and especially immersive services that require deterministic low latency that behaves in a consistent manner. Towards addressing this limitation, the authors of this work propose 5 novel latency metrics that, when leveraged alongside the conventional latency metrics, manage to provide advanced insights that can be potentially used to improve service performance. The validity and usefulness of the proposed metrics in the frame of providing advanced insights into service performance is evaluated using a large-scale experiment.

Distributed, Parallel, and Cluster Computing

What problem does this paper attempt to address?

The paper addresses the inadequacy of existing latency metrics for evaluating the performance of modern distributed latency-sensitive services. Conventional metrics are formulated around a broad definition of traditional monolithic services and do not capture the complexities of modern services and distributed computing paradigms. Specifically, they offer no insight into two key aspects:

1. **Frequency of SLA threshold violations**: how often latency surpasses the specified Service Level Agreement (SLA) threshold.
2. **Recovery time after a violation**: how long latency takes to return to an acceptable level once the threshold has been exceeded.

These two gaps matter most for modern latency-sensitive services, especially immersive services that require deterministic low latency, such as Extended Reality (XR) and Massively Multiplayer Mobile Games (MMG). The paper therefore proposes five new latency metrics, based on fault tolerance, that are meant to be used alongside the conventional ones. Together they give deeper insight into service performance and can help optimize it. The new metrics evaluate the stability and response time of services at a more granular level, especially under high load and sudden spikes in demand.
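The two aspects above can be made concrete with a small sketch. It is not an implementation of the paper's five metrics, which this summary does not define. It only illustrates, under stated assumptions, what "violation frequency" and "recovery time" could mean for a latency trace. The names `summarize_violations` and `ViolationSummary`, the `(timestamp_s, latency_ms)` sample format, and the choice to measure recovery from the first violating sample to the first compliant one are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class ViolationSummary:
    episodes: int            # number of distinct runs above the threshold
    violating_samples: int   # number of individual samples above the threshold
    mean_recovery_s: float   # mean duration of episodes that recovered
    max_recovery_s: float    # longest duration among episodes that recovered


def summarize_violations(
    samples: Sequence[Tuple[float, float]], threshold_ms: float
) -> ViolationSummary:
    """Summarize SLA violations in a latency trace.

    `samples` is a sequence of (timestamp_s, latency_ms) pairs sorted by
    timestamp. An episode starts at the first sample above the threshold
    and ends at the first later sample at or below it (an assumption made
    for this sketch, not a definition taken from the paper).
    """
    recoveries: List[float] = []
    violating = 0
    start = None  # timestamp at which the current episode began, if any

    for timestamp, latency in samples:
        if latency > threshold_ms:
            violating += 1
            if start is None:
                start = timestamp
        elif start is not None:
            recoveries.append(timestamp - start)
            start = None

    # An episode still open at the end of the trace counts as an episode,
    # but contributes no recovery time because it never recovered.
    episodes = len(recoveries) + (1 if start is not None else 0)
    mean_recovery = sum(recoveries) / len(recoveries) if recoveries else 0.0
    max_recovery = max(recoveries) if recoveries else 0.0
    return ViolationSummary(episodes, violating, mean_recovery, max_recovery)


if __name__ == "__main__":
    trace = [(0.0, 20.0), (1.0, 60.0), (2.0, 70.0), (3.0, 30.0),
             (4.0, 55.0), (5.0, 25.0), (6.0, 20.0)]
    print(summarize_violations(trace, threshold_ms=50.0))
```

A percentile or mean computed over the same trace would not distinguish two short violations from one long one, which is the kind of gap the summary attributes to conventional metrics.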

Data-driven Predictive Latency for 5G: A Theoretical and Experimental Analysis Using Network Measurements

Unveiling Latency-Induced Service Degradation: A Methodological Approach With Dataset

Software-Defined Latency Monitoring In Data Center Networks

A Multivariate Characterization and Detection of Software Performance Antipatterns

A Light-Weight Statistical Latency Measurement Platform at Scale

Effectiveness of distributed stateless network server selection under strict latency constraints

Optimizing Microservices Placement in the Cloud-to-Edge Continuum: A Comparative Analysis of App and Service Based Approaches

Evaluating Latency-Sensitive Applications: Performance Degradation in Datacenters with Restricted Power Budget

Measuring and simulating latency in interactive remote rendering systems

Identifying Requirements Affecting Latency in a Softwarized Network for Future 5G and Beyond

Performance of Network and Service Monitoring Frameworks

Battle of Microservices: Towards Latency-Optimal Heuristic Scheduling for Edge Computing

Dynamic Control of Data-Intensive Services over Edge Computing Networks

SSL: A Surrogate-Based Method for Large-Scale Statistical Latency Measurement

New primitives for bounded degradation in network service

Low Latency Datacenter Networking: A Short Survey

Going Fast and Fair: Latency Optimization for Cloud-Based Service Chains

Measuring and Evaluating TCP Splitting for Cloud Services

LLAMP: Assessing Network Latency Tolerance of HPC Applications with Linear Programming