Per-Instruction Cycle Stacks Through Time-Proportional Event Analysis

Björn Gottschall, Lieven Eeckhout, Magnus Jahre
DOI: https://doi.org/10.1109/mm.2024.3407377
IF: 2.8212
2024-08-29
IEEE Micro
Abstract: Understanding what applications spend time on and why is critical for effective performance optimization. Unfortunately, current state-of-the-art performance analysis tools are generally unable to provide this information. The fundamental reason is that they lack time proportionality; i.e., in many cases, they do not attribute execution time to the instructions and performance events that the architecture is exposing the latency of. Time-proportional event analysis (TEA) creates per-instruction cycle stacks, which clearly and accurately explain what the application spends time on and why at the level of individual static instructions. TEA requires executing the application only once; it is accurate (with an average error of 2.1%); and its hardware implementation incurs negligible runtime, power, and area overheads of 1.1%, 0.1%, and 249 bits per core, respectively.
computer science, software engineering, hardware & architecture