Evaluation of TLB Prefetching Techniques

P. Sarda, Girish Motwani, Deepak Patil
Abstract: The performance of processors has increased dramatically over the past twenty years. Main memories, on the other hand, have increased in capacity, but without matching reductions in latency. Memory hierarchies and other latency-hiding techniques have nevertheless allowed processor performance to keep increasing rapidly. In current processors with paged virtual memory, the Memory Management Unit (MMU) has the unenviable task of translating application virtual addresses into physical addresses. Larger data sets put increased pressure on the processor MMU, and in particular on the Translation Lookaside Buffer (TLB). The TLB is becoming the main component of the critical path in processor performance, and TLB misses account for a growing share of program execution time. Address translation through the TLB is therefore one of the most critical operations in determining the delivered performance of the CPU. To achieve higher performance, it is essential to speed up TLB miss handling, as it is one of the most frequently executed kernel services. TLB miss handling has been shown to constitute as much as 40% of execution time. One approach to improving the delivered performance of the TLB is to preload (prefetch) TLB entries to hide some or all of the miss cost. In this work, we evaluate the performance of three TLB prefetching (also known as TLB preloading) techniques, namely Sequential Prefetching, Recency-based TLB Prefetching, and Distance Prefetching, using some of the SPEC CPU 2000 benchmarks.
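To make the idea of hiding miss costs concrete, the following is a minimal sketch of two of the named techniques, sequential and distance prefetching, run over a trace of virtual page numbers. It is an illustration under stated assumptions, not the evaluation setup of this work: the fully associative LRU TLB, the separate prefetch buffer, the sizes, and every name (`LruSet`, `sequential_predictor`, `distance_predictor`, `count_misses`) are hypothetical. Recency-based prefetching is omitted because it needs additional per-page state.

```python
from collections import OrderedDict


class LruSet:
    """Tiny fully associative LRU store of virtual page numbers (assumed model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def hit(self, page):
        if page in self.pages:
            self.pages.move_to_end(page)  # mark as most recently used
            return True
        return False

    def insert(self, page):
        self.pages[page] = True
        self.pages.move_to_end(page)
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict least recently used

    def remove(self, page):
        self.pages.pop(page, None)


def sequential_predictor():
    """On a TLB miss to page p, predict page p + 1."""
    return lambda page: [page + 1]


def distance_predictor(slots=2):
    """On a TLB miss, look up the distance from the previous miss and
    predict the distances that followed it in the past."""
    table = {}  # distance -> most recent following distances
    state = {"prev_page": None, "prev_dist": None}

    def predict(page):
        prev_page, state["prev_page"] = state["prev_page"], page
        if prev_page is None:
            return []
        dist = page - prev_page
        if state["prev_dist"] is not None:
            # Record that `dist` followed the previous distance.
            followers = table.setdefault(state["prev_dist"], [])
            if dist in followers:
                followers.remove(dist)
            followers.insert(0, dist)
            del followers[slots:]
        state["prev_dist"] = dist
        return [page + d for d in table.get(dist, [])]

    return predict


def count_misses(trace, predict=None, tlb_size=4, buffer_size=4):
    """Count TLB misses that are not covered by the prefetch buffer."""
    tlb, buf = LruSet(tlb_size), LruSet(buffer_size)
    misses = 0
    for page in trace:
        if tlb.hit(page):
            continue
        if buf.hit(page):
            buf.remove(page)  # prefetched entry moves into the TLB
        else:
            misses += 1       # full-cost miss
        tlb.insert(page)
        if predict is not None:
            for candidate in predict(page):
                if candidate not in tlb.pages:
                    buf.insert(candidate)
    return misses


if __name__ == "__main__":
    strided = list(range(0, 60, 3))  # pages 0, 3, 6, ... (constant stride 3)
    print("no prefetch:", count_misses(strided))
    print("sequential: ", count_misses(strided, sequential_predictor()))
    print("distance:   ", count_misses(strided, distance_predictor()))
```

On a constant-stride trace like the one above, the sequential predictor never guesses the next page, while the distance predictor starts covering misses once it has seen the stride repeat. Real workloads such as the SPEC CPU 2000 benchmarks mix many access patterns, so this toy trace says nothing about their relative merit there.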
Computer Science