Accelerating multi-tier storage cache simulations using knee detection
Tyler Estro,Mário Antunes,Pranav Bhandari,Anshul Gandhi,Geoff Kuenning,Yifei Liu,Carl Waldspurger,Avani Wildani,Erez Zadok
DOI: https://doi.org/10.1016/j.peva.2024.102410
IF: 2.205
2024-02-29
Performance Evaluation
Abstract:Storage cache hierarchies include diverse topologies, assorted parameters and policies, and devices with varied performance characteristics. Simulation enables efficient exploration of their configuration space while avoiding expensive physical experiments. Miss Ratio Curves (MRCs) efficiently characterize the performance of a cache over a range of cache sizes, revealing "key points" for cache simulation, such as knees in the curve that immediately follow sharp cliffs. Unfortunately, there are no automated techniques for efficiently finding key points in MRCs, and the cross-application of existing knee-detection algorithms yields inaccurate results. We present a multi-stage framework that identifies key points in any MRC, for both stack-based ( e.g. , LRU) and more sophisticated eviction algorithms ( e.g. , ARC). Our approach quickly locates candidates using efficient hash-based sampling, curve simplification, knee detection, and novel post-processing filters. We introduce Z-Method , a new multi-knee detection algorithm that employs statistical outlier detection to choose promising points robustly and efficiently. We evaluated our framework against seven other knee-detection algorithms, identifying key points in multi-tier MRCs with both ARC and LRU policies for 106 diverse real-world workloads. Compared to naïve approaches, our framework reduced the total number of points needed to accurately identify the best two-tier cache hierarchies by an average factor of approximately 5.5× for ARC and 7.7× for LRU. We also show how our framework can be used to seed the initial population for evolutionary algorithms. We ran 32,616 experiments requiring over three million cache simulations, on 151 samples, from three datasets, using a diverse set of population initialization techniques, evolutionary algorithms, knee-detection algorithms, cache replacement algorithms, and stopping criteria. Our results showed an overall acceleration rate of 34% across all configurations.
computer science, theory & methods, hardware & architecture