Achievable Stability in Redundancy Systems

Youri Raaijmakers, Sem Borst
DOI: https://doi.org/10.48550/arXiv.2008.03478
2020-08-08
Abstract: We consider a system with $N$ parallel servers where incoming jobs are immediately replicated to, say, $d$ servers. Each of the $N$ servers has its own queue and follows a FCFS discipline. As soon as the first job replica is completed, the remaining replicas are abandoned. We investigate the achievable stability region for a quite general workload model with different job types and heterogeneous servers, reflecting job-server affinity relations which may arise from data locality issues and soft compatibility constraints. Under the assumption that job types are known beforehand, we show for New-Better-than-Used (NBU) distributed speed variations that no replication ($d=1$) gives a strictly larger stability region than replication ($d>1$). Strikingly, this does not depend on the underlying distribution of the intrinsic job sizes, but observing the job types is essential for this statement to hold. In the case of non-observable job types, we show that for New-Worse-than-Used (NWU) distributed speed variations full replication ($d=N$) gives a larger stability region than no replication ($d=1$).
Probability, Performance
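
For reference, the NBU and NWU classes named in the abstract have standard definitions in reliability theory, stated here for a nonnegative random variable $X$ with tail $\bar F(t) = P(X > t)$. The short calculation after them is only an illustrative sketch under a simplifying assumption that is not taken from the paper: the $d$ replicas of a job have i.i.d. service times $X_1, \dots, X_d$ and all start at the same instant. It suggests why replication can consume extra capacity under NBU. It is not the authors' argument, which also has to handle job types, heterogeneous servers and FCFS queueing.

$$
\text{NBU:}\quad \bar F(s+t) \le \bar F(s)\,\bar F(t) \quad \text{for all } s, t \ge 0,
\qquad
\text{NWU:}\quad \bar F(s+t) \ge \bar F(s)\,\bar F(t) \quad \text{for all } s, t \ge 0.
$$

Iterating the NBU inequality gives $\bar F(dt) \le \bar F(t)^d$, and therefore

$$
d\,E\big[\min(X_1, \dots, X_d)\big]
= d \int_0^\infty \bar F(t)^d \, dt
\;\ge\; d \int_0^\infty \bar F(dt) \, dt
= E[X].
$$

Under the stated assumption, the left-hand side is the total server time spent on one job when $d$ replicas run until the first completes, and the right-hand side is the server time for a single copy. The inequality reverses for NWU, and the exponential distribution, which is both NBU and NWU, gives equality.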