BigDataBench: An Open-source Big Data Benchmark Suite
Jian-Feng ZHAN, Wan-Ling GAO, Lei WANG, Jing-Wei LI, Kai WEI, Chun-Jie LUO, Rui HAN, Xin-Hui TIAN, Chun-Yu JIANG
DOI: https://doi.org/10.11897/SP.J.1016.2016.00196
2016-01-01
Chinese Journal of Computers
Abstract: The big data boom has sparked a tremendous outpouring of interest in storing and processing these data, and a variety of big data systems have emerged as a result, putting great pressure on big data benchmarking. However, the complexity and diversity of big data make benchmarking challenging. Most related benchmark efforts either target specific application domains and software stacks or choose workloads subjectively according to so-called popularity, and thus fail to cover the diversity and complexity of big data. In this paper, we discuss the requirements for big data benchmarking and present our open-source big data benchmark suite, BigDataBench, a multidisciplinary research and engineering effort spanning systems, architecture, and data management. BigDataBench adopts an iterative and incremental methodology, covering five representative application domains as well as diverse data models and workload types. Currently, it includes 14 real-world data sets, scalable data generation tools for 3 data types, and 33 workloads implemented using competitive technologies. BigDataBench has been used in both academia and industry, with typical use cases including workload characterization, architecture design, and system optimization. Based on BigDataBench, the Chinese Academy of Information and Communications has released China's first industry-standard big data benchmark suite together with ICT, CAS, Huawei, and other well-known companies and research institutions.