Hermes: Improving Server Utilization by Colocation-Aware Runtime Systems.

Shenming Liu, Kaiming Li, Hao Huang
DOI: https://doi.org/10.1109/hpcc/smartcity/dss.2019.00131
2019-01-01
Abstract: Improving server utilization is increasingly important to service providers. Latency-critical services have strict tail-latency service-level objectives, which makes it difficult to safely colocate them with other workloads on the same machine; as a result, server resources are underutilized. We present Hermes, a user-level resource management layer that addresses this dilemma. Hermes implements two kinds of runtime systems: one for latency-critical workloads (LC runtime) and one for best-effort workloads (BE runtime). The LC runtime implements user-level thread management and controls the dedicated computing resources occupied by the latency-critical workload through a feedback-based controller. BE runtimes schedule the threads of best-effort workloads to take advantage of simultaneous multithreading. The runtime systems are aware of their colocation and cooperate to improve server utilization without violating the tail-latency service-level objectives of the latency-critical workload. Hermes is implemented entirely at user level on Linux. We evaluate Hermes using memcached and several synthetic micro-benchmarks; the results show that Hermes can achieve both safe colocation and improved core utilization.
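
The abstract does not spell out the control law behind the feedback-based controller, so the following is only a minimal sketch of what such a loop might look like, not the authors' implementation. It assumes a simple threshold policy: grant the latency-critical workload one more dedicated core when the measured tail latency exceeds the objective, and release one when latency is comfortably below it. The class name `CoreController`, the field names, the microsecond unit, and the 0.6 low watermark are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CoreController:
    """Hypothetical threshold controller for cores dedicated to the LC workload."""

    slo_us: float                # tail-latency objective (assumed unit: microseconds)
    min_cores: int = 1           # never shrink the LC workload below this
    max_cores: int = 8           # assumed number of cores on the machine
    low_watermark: float = 0.6   # assumed: release a core below 60% of the objective
    cores: int = 1               # cores currently dedicated to the LC workload

    def step(self, measured_tail_us: float) -> int:
        """Update the allocation from one tail-latency sample and return it."""
        if measured_tail_us > self.slo_us:
            # Objective violated: take one core back from best-effort work.
            self.cores = min(self.cores + 1, self.max_cores)
        elif measured_tail_us < self.low_watermark * self.slo_us:
            # Plenty of slack: hand one core over to best-effort work.
            self.cores = max(self.cores - 1, self.min_cores)
        # Otherwise stay put; the dead band avoids oscillating around the objective.
        return self.cores


if __name__ == "__main__":
    controller = CoreController(slo_us=500.0, cores=2)
    for sample in (600.0, 400.0, 100.0):
        print(sample, controller.step(sample))
```

In this sketch, the gap between the low watermark and the objective acts as a dead band, so small latency fluctuations do not cause cores to bounce between the latency-critical and best-effort workloads.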