SmartScan: Efficient metadata crawl for storage management metadata querying in large file systems

Likun Liu, Lianghong Xu, Yongwei Wu, Guangwen Yang, Gregory R Ganger
2010-01-01
Abstract: SmartScan is a metadata crawl tool that exploits patterns in metadata changes to significantly improve the efficiency of support for file-system-wide metadata querying, which is an important tool for administrators. In most environments, support for metadata queries is provided by databases populated and refreshed by calling stat() on every file in the file system. For large file systems, where such storage management tools are most needed, it can take many hours to complete each scan, even if only a small percentage of the files have changed. To address this issue, we identify patterns in metadata changes that can be exploited to restrict scanning to the small subsets of directories that have recently had modified files or that have high variation in file change times. Experiments with using SmartScan on production file systems show that exploiting metadata change patterns can reduce the time needed to refresh the metadata database by one or two orders of magnitude with minimal loss of freshness.

Acknowledgements: We thank Michael Stroucken, Raja Sambasivan, and Michelle Mazurek for their comments and suggestions on this paper; Mitch Franzos, Wusheng Zhang, Qiaoke Zhang and Ze Chen for helping test our scripts; and our colleagues in the Parallel Data Laboratory for their early feedback and discussions on this work.
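The directory-selection idea stated in the abstract can be illustrated with a minimal sketch. It is not SmartScan's actual implementation: the function name `choose_dirs_to_rescan`, the two thresholds, and the use of a population standard deviation as the measure of "variation" are all assumptions made for illustration. The input is assumed to be the per-directory file change times already stored in the metadata database from an earlier scan, so no file needs a fresh stat() call to decide which directories to revisit.

```python
import statistics


def choose_dirs_to_rescan(prev_change_times, now, recent_window, spread_threshold):
    """Pick directories to rescan from previously recorded change times.

    prev_change_times: dict mapping a directory path to the list of file
        change times (seconds) recorded by an earlier scan (hypothetical
        layout, not taken from the paper).
    now: current time in seconds.
    recent_window: a directory counts as "recently modified" if its newest
        recorded change is at most this many seconds old (assumed rule).
    spread_threshold: a directory counts as "high variation" if the
        population standard deviation of its change times is at least
        this many seconds (assumed rule).
    """
    chosen = []
    for directory, times in prev_change_times.items():
        if not times:
            # No recorded files, so there is nothing to base a decision on.
            continue
        recently_modified = (now - max(times)) <= recent_window
        high_variation = statistics.pstdev(times) >= spread_threshold
        if recently_modified or high_variation:
            chosen.append(directory)
    return sorted(chosen)
```

Only the directories returned by such a filter would then be rescanned with stat(), while the rest of the database is left as it is until a later full scan.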