Exploring Real-Time Data Processing Using Big Data Frameworks

Sampath Kini K
DOI: https://doi.org/10.52783/cana.v31.1561
2024-09-07
Abstract: Big data frameworks such as Apache Spark, Kafka, and Flink are recent developments that improve data-processing throughput and make real-time data processing possible. Because quick decisions depend on them, the scalability, fault tolerance, and latency of three architectures (stream processing, Lambda, and Kappa) are studied and measured. Based on a methodical survey of the literature, performance benchmarks, and case studies, the pros and cons of all three frameworks and architectures are assessed so that each can be matched to distinct operational use cases. For example, our study indicates that Spark is a natural fit for micro-batch workloads, Kafka for high-throughput workloads, and Flink for true stream processing, owing to its handling of complex event time. These findings bear on system architecture for companies that want to implement real-time data processing. Future research should examine whether edge computing can reduce response time and data-processing bottlenecks, and whether it can be combined with machine learning to improve predictive analysis. This work outlines what practitioners and researchers need to do to make real-time data processing part of their operations.