Design and Implementation of the MaxStream Federated Stream Processing Architecture
Irina Botan, Younggoo Cho, Roozbeh Derakhshan, Nihal Dindar, Laura Haas, Kihong Kim, Chulwon Lee, Girish Mundada, Ming-Chien Shan, Nesime Tatbul, Ying Yan, Beomjin Yun, Jin Zhang
DOI: https://doi.org/10.3929/ethz-a-006835778
2009-01-01
Abstract: Despite the availability of several commercial data stream processing engines (SPEs), it remains hard to develop and maintain streaming applications. Major difficulties are the lack of standards and the wide (and changing) variety of application requirements. Consequently, existing SPEs vary widely in data and query models, APIs, functionality, and optimization capabilities. This has led some organizations to use multiple SPEs, chosen according to their application needs. Furthermore, stored data and streaming data are still mostly managed as separate concerns, although applications increasingly require integrated access to both. In the MaxStream project, our goal is to design and build a federated stream processing architecture that seamlessly integrates multiple autonomous and heterogeneous SPEs with traditional databases, and hence facilitates the incorporation of new functionality and requirements. In this paper, we describe the design and implementation of the MaxStream architecture, and demonstrate its feasibility and performance on two benchmarks: the Linear Road Stream Data Management Benchmark and the SAP Sales and Distribution Benchmark.
I. INTRODUCTION