An Empirical Study On Implementing Highly Reliable Stream Computing Systems With Private Cloud

Yaxiao Liu, Weidong Liu, Jiaxing Song, Huan He
DOI: https://doi.org/10.1016/j.adhoc.2015.07.009
IF: 4.816
2015-01-01
Ad Hoc Networks
Abstract: Stream computing systems are designed for high-frequency data and, in real deployments, can handle billions of transactions per day. Cloud technology can support distributed stream computing systems through its elasticity and fault tolerance. In a real deployment environment, such as the pre-treatment systems of top Chinese banks, reliability as perceived by users is a key performance metric. Although many significant works have been proposed in the literature, they have limitations, such as a lack of architectural focus or difficulty of implementation in complex projects. This paper describes the reliability issue caused by service downgrade in the cloud. We use novel reliability analysis techniques, queuing theory, and software rejuvenation management techniques to build a framework that supports stream data with low latency and fault tolerance. A real streaming system from a top bank is studied to provide the supporting data. Operational parameters such as the rejuvenation window and the time-out parameter are identified as key parameters for the design of a distributed stream processing system. An algorithm for reliability optimization, monitoring, and forecasting is also introduced. The paper also compares the improved results with the original issues; the improvements saved millions in costs and protected the bank's reputation. (C) 2015 Elsevier B.V. All rights reserved.
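The abstract names queuing theory and the time-out parameter but gives no formulas. As a rough illustration only, and not the model used in the paper, the following sketch assumes a single M/M/1 first-come-first-served queue, for which the response time is exponentially distributed with rate (service rate minus arrival rate). The function names and the example rates are hypothetical. It shows how a drop in service rate, as might follow a service downgrade, raises the share of requests that exceed a fixed time-out.

```python
import math


def timeout_probability(arrival_rate, service_rate, timeout):
    """P(response time > timeout) for an M/M/1 FCFS queue (assumed model).

    Rates are in requests per second and timeout is in seconds.
    """
    if arrival_rate <= 0 or service_rate <= 0 or timeout < 0:
        raise ValueError("rates must be positive and timeout non-negative")
    if arrival_rate >= service_rate:
        # Unstable queue: the backlog grows without bound, so in the long
        # run every request exceeds any finite time-out.
        return 1.0
    return math.exp(-(service_rate - arrival_rate) * timeout)


def min_timeout(arrival_rate, service_rate, target):
    """Smallest time-out for which at most `target` of requests time out."""
    if not 0 < target < 1:
        raise ValueError("target must be strictly between 0 and 1")
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable; no finite time-out suffices")
    return -math.log(target) / (service_rate - arrival_rate)


# Hypothetical numbers: a healthy node versus a degraded one.
healthy = timeout_probability(arrival_rate=80, service_rate=100, timeout=0.1)
degraded = timeout_probability(arrival_rate=80, service_rate=85, timeout=0.1)
print(f"healthy: {healthy:.3f}, degraded: {degraded:.3f}")
```

Under this simplified model, restoring the service rate (for example through rejuvenation) or lengthening the time-out both reduce the time-out share, which is one way to see why the two parameters interact.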