Characterizing Job Microarchitectural Profiles at Scale: Dataset and Analysis

Kangjin Wang, Ying Li, Cheng Wang, Tong Jia, Kingsum Chow, Yang Wen, Yaoyong Dou, Guoyao Xu, Chuanjia Hou, Jie Yao, Liping Zhang
DOI: https://doi.org/10.1145/3545008.3545026
2022-01-01
Abstract: Understanding the microarchitectural resource characteristics of datacenter jobs has become increasingly critical for guaranteeing job performance while improving resource utilization. Prior work has studied the resource characteristics of datacenter jobs at the OS level, but little of it reveals detailed characteristics at the microarchitecture level, owing to the lack of related open traces. In this paper, we provide a new open trace, AMTrace (Alibaba Microarchitecture Trace), which is profiled from 8,577 high-end physical hosts in Alibaba's datacenter by a hardware/software co-design monitoring method. AMTrace provides the microarchitectural metrics of 9.8 × 10^5 Linux containers at "Per-Container-Per-Logic CPU" granularity. Unlike existing open traces, AMTrace offers a new perspective for analyzing the microarchitectural resource characteristics of datacenter jobs. Based on AMTrace, we first reveal the uneven resource usage of jobs among multiple logic CPUs. Then, we analyze the impact of CPU and memory bandwidth contention on job performance. Finally, we analyze job performance under different CPU provisioning modes from a microarchitecture perspective. These analyses lead to constructive insights for datacenter resource management and optimization. Furthermore, we discuss possible research opportunities based on AMTrace, and we believe that AMTrace will inspire more exciting research on microarchitecture and resource management.