Abstract:<p class="a-plus-plus">In modern cloud environments, applications are often co-located on one platform and share its hardware and software resources to reduce cost. However, co-located execution and resource sharing cause memory access conflicts, especially in the Last Level Cache (LLC). This paper proposes a lightweight LLC partitioning method named Classification-and-Allocation (C&A). Specifically, a Support Vector Machine (SVM) classifies applications into three classes based on their performance change characteristic (PCC), and a Bayesian Optimizer (BO) schedules the LLC so that applications with the same PCC share the same part of the LLC. Because BO-based scheduling finds a near-optimal partition within a few sampling steps, C&A can handle unseen and versatile workloads with low overhead. We evaluate the proposed method on several workloads. Experimental results show that C&A outperforms the state-of-the-art method KPart (El-Sayed et al. in Proceedings of 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA) 104–117, <span class="a-plus-plus citation-ref citationid-c-r25">2018</span>) by 7.45% and 22.50% in overall system throughput and fairness, respectively, and reduces allocation overhead by 20.60%.</p>

CPpf: a prefetch aware LLC partitioning approach

Cache Management with Partitioning-Aware Eviction and Thread-Aware Insertion/Promotion Policy

Improving Cache Partitioning Algorithms For Pseudo-Lru Policies

LPCA: Learned MRC Profiling based Cache Allocation for File Storage Systems

A Utility Based Cache Optimization Mechanism for Multi-Thread Workloads

Coupled Data Prefetch and Cache Partitioning Scheme for CPU-Accelerator System

Access Adaptive and Thread-Aware Cache Partitioning in Multicore Systems

Prefetching Techniques for STT-RAM Based Last-Level Cache in CMP Systems

Effective Cache Apportioning for Performance Isolation Under Compiler Guidance

PCG: Mitigating Conflict-based Cache Side-channel Attacks with Prefetching

DCAPS: dynamic cache allocation with partial sharing

Combining Process-Based Cache Partitioning and Pollute Region Isolation to Improve Shared Last Level Cache Utilization on Multicore Systems

Machine-learning-based cache partition method in cloud environment

An Application-Oriented Cache Allocation and Prefetching Method for Long-Running Applications in Distributed Storage Systems

Performance Study of Partitioned Caches in Asymmetric Multi-Core Processors

Reducing last level cache pollution through OS-level software-controlled region-based partitioning

Co-Optimizing Cache Partitioning and Multi-Core Task Scheduling: Exploit Cache Sensitivity or Not?

A Frequency Based Cache Replacement Algorithm with Partition of CMPs

Predictable Sharing of Last-level Cache Partitions for Multi-core Safety-critical Systems

CIACP: A Correlation- and Iteration- Aware Cache Partitioning Mechanism to Improve Performance of Multiple Coarse-Grained Reconfigurable Arrays

Exploring Data Prefetching Mechanisms for Last Level Cache in Chip Multi-Processors