Raven: Benchmarking Monetary Expense and Query Efficiency of OLAP Engines on the Cloud

Tongyu Wu, Rong Gu, Yang Li, Hongbin Ma, Yi Chen, Ying Zhu, Xiaoxiang Yu, Tengting Xu, Yihua Huang
DOI: https://doi.org/10.1007/978-3-031-30678-5_45
2023-01-01
Abstract: Building OLAP services on cloud platforms is now common, and cloud OLAP adopters want to understand and characterize how OLAP engines perform on the cloud. However, traditional OLAP benchmarks are usually designed for on-premises environments. When used to evaluate cloud OLAP engines, they are limited in how well they adapt to cloud environments and execute benchmarks for cloud scenarios. To address these issues, this paper proposes Raven, a cloud-oriented OLAP benchmark with a flexible system architecture and diversified workloads. Raven supports cloud service deployment and integration with various cloud OLAP engines. In addition, to simulate complex cloud query scenarios, we design a group of timeline-based and service-oriented workloads. We implement Raven on the Amazon AWS cloud platform and use it to evaluate several widely used OLAP engines of typical types, including Presto, SparkSQL, Kylin, and Athena. Experimental results show that Raven can effectively benchmark diverse OLAP engines, as well as different configuration settings of the same engine. We also explore an OLAP case study on the cloud using Raven.