Threshold-based rerouting and replication for resolving job-server affinity relations

Youri Raaijmakers, Sem Borst, Onno Boxma
DOI: https://doi.org/10.1109/infocom42981.2021.9488909
2021-05-10
Abstract: We consider a system with several job types and two parallel server pools. Within each pool the servers are homogeneous, but across pools they may not be, in the sense that the service speed of a job may depend on its type as well as on the server pool. Immediately upon arrival, jobs are assigned to a server pool, possibly based on (partial) knowledge of their type. If such knowledge is not available upon arrival, it can nevertheless be obtained while the job is in service: as the service progresses, the likelihood that the service speed of this job type is low increases, creating an incentive to execute the job on different, possibly faster, server(s). Two policies are considered: reroute the job to the other server pool, or replicate it there. We determine the effective load per server under both the rerouting and the replication policy, for completely unknown as well as partly known job types. We also examine the impact of these policies on the stability bound, which is defined as the maximum arrival rate of jobs for which the effective load per server is smaller than one. We demonstrate that the uncertainty in job types may significantly reduce the stability bound, and that for (highly) unbalanced service speeds full replication achieves the largest stability bound. Finally, we discuss how the use of threshold-based policies can help improve the expected latency for completely or partly unknown job types.
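To make the notion of a stability bound under threshold-based rerouting concrete, the following is a minimal sketch of a toy model, not the model analysed in the paper. It assumes exponentially distributed service times with type- and pool-dependent rates, completely unknown job types, random initial assignment, a single reroute after a fixed threshold, and a restart from scratch on the other pool. All names (`work_per_job`, `stability_bound`, `mu`, `q`, `p`, `tau`) are hypothetical and chosen for illustration only.

```python
import math

def work_per_job(mu, q, p, tau):
    """Expected server time one arriving job consumes in each of the two pools.

    Toy assumptions (not taken from the paper):
      mu[i][j] : exponential service rate of a type-i job in pool j
      q[i]     : probability that an arriving job is of type i
      p        : probability that a job is first assigned to pool 0
      tau      : threshold; a job still unfinished after tau time units
                 is rerouted once and restarts on the other pool
    """
    work = [0.0, 0.0]
    for i, qi in enumerate(q):
        for first, prob_first in ((0, p), (1, 1.0 - p)):
            other = 1 - first
            rate = mu[i][first]
            # Probability that the job has not finished before the threshold.
            p_reroute = math.exp(-rate * tau)
            # E[min(X, tau)] for X ~ Exp(rate): time spent in the first pool.
            work[first] += qi * prob_first * (1.0 - p_reroute) / rate
            # Rerouted jobs run to completion on the other pool.
            work[other] += qi * prob_first * p_reroute / mu[i][other]
    return work

def stability_bound(mu, q, p, tau, servers):
    """Largest arrival rate keeping the effective load per server below one."""
    work = work_per_job(mu, q, p, tau)
    return min(n / w for n, w in zip(servers, work) if w > 0)

# Two job types with strong, opposite affinities for the two pools.
mu = [[1.0, 0.1], [0.1, 1.0]]
q = [0.5, 0.5]
no_rerouting = stability_bound(mu, q, 0.5, math.inf, [1, 1])
with_threshold = stability_bound(mu, q, 0.5, 1.0, [1, 1])
print(no_rerouting, with_threshold)
```

In this toy instance a finite threshold yields a larger stability bound than never rerouting, because jobs that run long in one pool are disproportionately those with low service speed there.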