Abstract: In recent decades, data aggregation and correlation have emerged as a significant and challenging research area for finding useful relationships between features of streaming data. Similarly, complex event processing (CEP) has emerged as a vital tool for aggregating streaming data and finding patterns in it, with sources such as video streams, real‐time stock trades, and sensor data. In this context, rule‐based classifier algorithms are widely employed to discover valuable patterns in streaming data and cause‐and‐effect relationships between events. However, streaming data is dynamic and complex in nature, and it is not feasible for domain experts to update the rules manually on a continuous basis. Conversely, dynamic data sometimes fails to fit the existing rules, which leads to suboptimal CEP implementations. Rule‐based CEP systems come with their own set of challenges, chief among them the adaptability of rules to streaming IoT data. In a dynamic environment, shifting patterns in data streams can affect both the performance of the CEP system and the correlation between events, which makes finding useful patterns very challenging. In this article, our objective is to provide a comprehensive survey of CEP using rule‐based algorithms applied across various domains and applications. We present a broad literature review organized as a sequential expansion of the methods used for CEP, which primarily include event producers, preprocessing of events, robust rule implementation, and decision support. Additionally, we present a detailed classification of the approaches used for different applications. Finally, this rigorous review allows us to present the state‐of‐the‐art research, which focuses on designing robust rules, application‐specific insights, and scalable event‐driven architectures, and to identify the domain's trends along with significant challenges and future scope.

A Distributed Rule Engine for Streaming Big Data

A distributed architecture for rule engine to deal with big data

Distributed High-Dimension Matrix Operation Optimization on Spark

Research on Rule Matching Model Based on Spark

SparkRDF: Elastic Discreted RDF Graph Processing Engine with Distributed Memory

BigSR: an empirical study of real-time expressive RDF stream reasoning on modern Big Data platforms

Parallel Processing of Sensor Data in a Distributed Rules Engine Environment through Clustering and Data Flow Reconfiguration

Progressive online aggregation in a distributed stream system

Large Scale Semantic Rule-based Backward Chaining Reasoning on Spark

A scalable rule engine system for trigger-action application in large-scale IoT environment

Rule based complex event processing for IoT applications: Review, classification and challenges

A Novel Approach to Distributed Rule Matching and Multiple Firing Based on MapReduce

Large-Scale Real-Time Semantic Processing Framework for Internet of Things

A Distributed Real-Time Recommender System for Big Data Streams

Stream prediction using representative episode rules

A common interface for multi-rule-engine distributed systems

One SQL to Rule Them All

Re-Stream: Real-time and Energy-Efficient Resource Scheduling in Big Data Stream Computing Environments

Cichlid: Efficient Large Scale RDFS/OWL Reasoning with Spark

Distributed Streaming Analytics on Large-scale Oceanographic Data using Apache Spark