Correct Wrong Path

Bhargav Reddy Godala,Sankara Prasad Ramesh,Krishnam Tibrewala,Chrysanthos Pepi,Gino Chacon,Svilen Kanev,Gilles A. Pokam,Daniel A. Jiménez,Paul V. Gratz,David I. August
2024-08-12
Abstract: Modern OOO CPUs have very deep pipelines with large branch misprediction recovery penalties. Speculatively executed instructions on the wrong path can significantly change cache state, depending on speculation levels. Architects often employ trace-driven simulation models in the design exploration stage, which sacrifice precision for speed. Trace-driven simulators are orders of magnitude faster than execution-driven models, reducing the often hundreds of thousands of simulation hours needed to explore new micro-architectural ideas. Despite this strong benefit of trace-driven simulation, these simulators often fail to adequately model the consequences of the wrong path because obtaining wrong-path instructions is nontrivial. Prior works consider either a positive or a negative impact of the wrong path, but not both. Here, we examine wrong-path execution in simulation results and design a set of infrastructure for enabling wrong-path execution in a trace-driven simulator. Our analysis shows the wrong path affects structures on both the instruction and data sides extensively, resulting in performance variations ranging from -3.05% to 20.9% when ignoring the wrong path. To benefit the research community and enhance the accuracy of simulators, we opened our traces and tracing utility in the hopes that industry can provide wrong-path traces generated by their internal simulators, enabling academic simulation without exposing industry IP.
Hardware Architecture
What problem does this paper attempt to address?
The paper addresses how wrong-path (WP) instructions, executed after a branch misprediction in modern out-of-order (OOO) CPUs, change cache state and performance, and why trace-driven simulation fails to capture that effect. Specifically:

1. **Limitations of existing simulators**: Trace-driven simulators are fast, but their traces usually contain only correct-path instructions, so they cannot accurately model wrong-path execution. This leads to inaccurate performance estimates.
2. **Impact of the wrong path**: Wrong-path instructions may significantly change the cache state, thereby affecting the execution of subsequent correct-path instructions. The impact can be positive (for example, pre-loading useful data) or negative (for example, polluting the cache). Existing research considers either the positive impact or the negative impact, but not both together.
3. **Need to improve simulation accuracy**: Evaluating new micro-architecture designs more accurately requires a method that models both the positive and the negative impacts of wrong-path instructions.

To solve these problems, the authors use an execution-driven simulator to generate traces that contain wrong-path instructions and then replay those traces in a trace-driven simulator. This approach improves simulation accuracy while retaining the speed advantage of trace-driven simulation. Specific contributions include:

- Providing detailed analysis and metrics to measure the impact of wrong-path instructions.
- Designing a new trace format and tool for capturing and encoding wrong-path instructions.
- Implementing correct modeling of wrong-path instructions in the trace-driven simulator.
- Open-sourcing a set of traces for widely used data-center and SPEC workloads to promote academic research.

Through these improvements, the authors aim to significantly improve the accuracy of the trace-driven simulator without sacrificing its speed, thereby better supporting the exploration and evaluation of micro-architecture designs.
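
To make the positive and negative effects concrete, the following Python toy replays a trace whose records carry a wrong-path flag through a tiny LRU cache and counts only correct-path misses. It is a minimal sketch under stated assumptions: the record layout (`TraceRecord`), the cache model (`LRUCache`), the helper `correct_path_misses`, and the two example traces are all hypothetical and are not the paper's trace format or simulator.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceRecord:
    """Hypothetical trace record; the paper's real format is not shown here."""
    addr: int          # cache-line address touched by the instruction
    wrong_path: bool   # True if the instruction was later squashed


class LRUCache:
    """Fully associative LRU cache holding `num_lines` cache lines."""

    def __init__(self, num_lines: int) -> None:
        self.num_lines = num_lines
        self.lines: OrderedDict[int, None] = OrderedDict()

    def access(self, addr: int) -> bool:
        """Return True on a hit; on a miss, fill the line (evicting LRU)."""
        if addr in self.lines:
            self.lines.move_to_end(addr)      # mark as most recently used
            return True
        if len(self.lines) >= self.num_lines:
            self.lines.popitem(last=False)    # evict least recently used
        self.lines[addr] = None
        return False


def correct_path_misses(trace, num_lines: int, model_wrong_path: bool) -> int:
    """Count misses seen by correct-path accesses only.

    Wrong-path accesses are never counted, but when `model_wrong_path`
    is True they still update the cache state.
    """
    cache = LRUCache(num_lines)
    misses = 0
    for rec in trace:
        if rec.wrong_path:
            if model_wrong_path:
                cache.access(rec.addr)
            continue
        if not cache.access(rec.addr):
            misses += 1
    return misses


A, B, C, D = 0x100, 0x140, 0x180, 0x1C0  # arbitrary illustrative line addresses

# Positive effect: the wrong path touches A before the correct path needs it.
helpful = [TraceRecord(A, True), TraceRecord(A, False)]

# Negative effect: the wrong path evicts A and B from a 2-line cache.
polluting = [
    TraceRecord(A, False), TraceRecord(B, False),
    TraceRecord(C, True), TraceRecord(D, True),
    TraceRecord(A, False), TraceRecord(B, False),
]

for name, trace in [("helpful", helpful), ("polluting", polluting)]:
    ignored = correct_path_misses(trace, 2, model_wrong_path=False)
    modeled = correct_path_misses(trace, 2, model_wrong_path=True)
    print(f"{name}: misses ignoring WP={ignored}, modeling WP={modeled}")
```

In this toy, ignoring the wrong path overestimates misses for the first trace and underestimates them for the second, which is why a simulator that models only one direction of the effect can be wrong in either direction.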