Cache Design of SSD-Based Search Engine Architectures: An Experimental Study

Jianguo Wang, Eric Lo, Man Lung Yiu, Jiancong Tong, Gang Wang, Xiaoguang Liu
DOI: https://doi.org/10.1145/2661629
IF: 4.657
2014-01-01
ACM Transactions on Information Systems
Abstract: Caching is an important optimization in search engine architectures. Existing caching techniques for search engine optimization are mostly biased towards reducing random accesses to disk, because random accesses are known to be much more expensive than sequential accesses on traditional magnetic hard disk drives (HDDs). Recently, the solid-state drive (SSD) has emerged as a new kind of secondary storage medium, and some search engines, such as Baidu, have already used SSDs to completely replace HDDs in their infrastructure. One notable property of an SSD is that its random access latency is comparable to its sequential access latency. Therefore, replacing HDDs with SSDs in a search engine infrastructure may void the cache management of existing search engines. In this article, we carry out a series of empirical experiments to study the impact of SSDs on search engine cache management. Based on the results, we give insights to practitioners and researchers on how to adapt the infrastructure and caching policies for SSD-based search engines.