The Impact of Solid State Drive on Search Engine Cache Management

Jianguo Wang,Eric Lo,Man Lung Yiu,Jiancong Tong,Gang Wang,Xiaoguang Liu
DOI: https://doi.org/10.1145/2484028.2484046
2013-01-01
Abstract:Caching is an important optimization in search engine architectures. Existing caching techniques for search engine optimization are mostly biased towards the reduction of random accesses to disks, because random accesses are known to be much more expensive than sequential accesses in traditional magnetic hard disk drive (HDD). Recently, solid state drive (SSD) has emerged as a new kind of secondary storage medium, and some search engines like Baidu have already used SSD to completely replace HDD in their infrastructure. One notable property of SSD is that its random access latency is comparable to its sequential access latency. Therefore, the use of SSDs to replace HDDs in a search engine infrastructure may void the cache management of existing search engines. In this paper, we carry out a series of empirical experiments to study the impact of SSD on search engine cache management. The results give insights to practitioners and researchers on how to adapt the infrastructure and how to redesign the caching policies for SSD-based search engines.
What problem does this paper attempt to address?