SymS: a symmetrical scheduler to improve multi‐threaded program performance on NUMA systems

Liang Zhu, Hai Jin, Xiaofei Liao
DOI: https://doi.org/10.1002/cpe.3638
2015-01-01
Abstract: The nonuniform memory access (NUMA) architecture has been used extensively in data centers. Most previous work has studied the performance of NUMA systems using single-threaded multiprogrammed workloads, focusing mainly on two classes of problems: resource contention and data locality. However, when multi-threaded programs run on NUMA systems, the critical thread of a program significantly influences system performance and brings new challenges that differ from those in the single-threaded situation. In particular, an additional scheduling scheme is needed to avoid the performance degradation caused by the critical thread of multi-threaded programs running on NUMA systems. This work presents a scheduler, the Symmetrical Scheduler (SymS), which solves the lagging problem by balancing the number of costly remote shared data accesses across threads on NUMA systems. To the best of our knowledge, little work has examined the performance impact of the critical thread of multi-threaded programs on NUMA systems. On the PARSEC benchmark running on such systems, our methodology improves program performance by 6% on average and by up to 25.3% compared with the Linux kernel scheduling mechanism. Copyright (C) 2015 John Wiley & Sons, Ltd.