Alibaba Hologres

Xiaowei Jiang, Yuejun Hu, Yu Xiang, Guangran Jiang, Xiaojun Jin, Chen Xia, Weihua Jiang, Jun Yu, Haitao Wang, Yuan Jiang, Jihong Ma, Li Su, Kai Zeng
DOI: https://doi.org/10.14778/3415478.3415550
IF: 2.5
2020-01-01
Proceedings of the VLDB Endowment
Abstract: In existing big data stacks, the processes of analytical processing and knowledge serving are usually separated into different systems. At Alibaba, we observed a new trend where these two processes are fused: knowledge serving generates new data, and these data are fed into analytical processing, which further fine-tunes the knowledge base used in the serving process. Splitting this fused processing paradigm into separate systems incurs overhead such as extra data duplication, discrepant application development, and expensive system maintenance. In this work, we propose Hologres, a cloud-native service for hybrid serving and analytical processing (HSAP). Hologres decouples the computation and storage layers, allowing flexible scaling in each layer. Tables are partitioned into self-managed shards. Each shard processes its read and write requests concurrently and independently of the other shards. Hologres leverages hybrid row/column storage to optimize operations such as point lookup, column scan, and data ingestion used in HSAP. We propose Execution Context as a resource abstraction between system threads and user tasks. Execution contexts can be cooperatively scheduled with little context-switching overhead. Queries are parallelized and mapped to execution contexts for concurrent execution. The scheduling framework enforces resource isolation among different queries and supports customizable scheduling policies. We conducted experiments comparing Hologres with existing systems specifically designed for analytical processing and serving workloads. The results show that Hologres consistently outperforms other systems in both system throughput and end-to-end query latency.