Impact of Data Locality on Garbage Collection in SSDs: A General Analytical Study

Yongkun Li, Patrick P. C. Lee, John C. S. Lui, Yinlong Xu
DOI: https://doi.org/10.1145/2668930.2688036
2015-01-01
Abstract: Solid-state drives (SSDs) necessitate garbage collection (GC) to erase data blocks and reclaim the space of invalidated data, and GC inevitably introduces additional writes due to data relocation. The performance of GC, which is quantified by cleaning cost or write amplification, is critical to the overall performance of SSDs. However, characterizing GC performance is complicated by the general implementations of GC algorithms and the complex data locality characteristics of real-world workloads. This paper presents a general analytical study to characterize the performance impact of data locality on a general family of GC algorithms. We develop probabilistic models to address two fundamental issues: (1) What is the impact of data locality on the performance of locality-oblivious GC? (2) How can data locality be leveraged to improve the performance in locality-aware GC? We further conduct extensive trace-driven simulations on real-world workloads to validate the findings of our models.
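To make the write amplification metric concrete, the following is a minimal simulation sketch, not the paper's probabilistic model or its trace-driven simulator. It assumes a toy page-mapped SSD, a greedy (locality-oblivious) victim choice that picks the full block with the fewest valid pages, and a simple two-class hot/cold workload. The function name and every parameter (`utilization`, `hot_fraction`, `hot_prob`, and so on) are hypothetical and chosen only for illustration. Write amplification is measured as (user writes + GC relocation writes) / user writes.

```python
import random


def simulate_write_amplification(num_blocks=64, pages_per_block=32,
                                 utilization=0.8, hot_fraction=0.2,
                                 hot_prob=0.8, num_writes=50_000, seed=0):
    """Toy page-mapped SSD with greedy GC; returns measured write amplification.

    Assumes utilization is well below 1.0 so that spare blocks always exist.
    """
    rng = random.Random(seed)
    num_logical = int(num_blocks * pages_per_block * utilization)
    # Hypothetical hot/cold split: the first hot_count logical pages are "hot".
    hot_count = min(max(1, int(num_logical * hot_fraction)), num_logical - 1)

    location = {}                               # logical page -> block holding it
    valid = [set() for _ in range(num_blocks)]  # valid logical pages per block
    used = [0] * num_blocks                     # programmed pages per block
    free_blocks = list(range(1, num_blocks))    # erased blocks
    open_block = 0                              # block currently being filled
    gc_writes = 0

    def program(lpn):
        """Write one logical page to the open block, invalidating its old copy."""
        nonlocal open_block
        if used[open_block] == pages_per_block:
            open_block = free_blocks.pop()
        old = location.get(lpn)
        if old is not None:
            valid[old].discard(lpn)
        valid[open_block].add(lpn)
        used[open_block] += 1
        location[lpn] = open_block

    def collect():
        """Greedy GC: relocate the valid pages of the emptiest full block, then erase it."""
        nonlocal gc_writes
        candidates = [b for b in range(num_blocks)
                      if b != open_block and used[b] == pages_per_block]
        victim = min(candidates, key=lambda b: len(valid[b]))
        for lpn in list(valid[victim]):
            program(lpn)       # relocation is an extra physical write
            gc_writes += 1
        used[victim] = 0       # erase
        free_blocks.append(victim)

    # Warm-up: write every logical page once so the drive starts out full.
    for lpn in range(num_logical):
        program(lpn)
    gc_writes = 0

    for _ in range(num_writes):
        while len(free_blocks) < 2:   # keep one spare block for relocation
            collect()
        if rng.random() < hot_prob:
            lpn = rng.randrange(hot_count)
        else:
            lpn = rng.randrange(hot_count, num_logical)
        program(lpn)

    return (num_writes + gc_writes) / num_writes


if __name__ == "__main__":
    # hot_prob == hot_fraction gives a uniform workload; the defaults give a skewed one.
    print("uniform:", simulate_write_amplification(hot_fraction=0.5, hot_prob=0.5))
    print("skewed: ", simulate_write_amplification())
```

Running the script prints the measured write amplification for a uniform and a skewed workload under the same greedy policy, which is the kind of comparison the abstract's first question concerns. The numbers depend entirely on the toy parameters above and should not be read as results from the paper.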