Abstract: The creation of monitoring clusters based on cloud computing technologies is a promising direction in the development of systems for continuous monitoring of objects of various purposes in the web space. The Hadoop web-programming environment is the technological basis for developing algorithmic and software solutions for the synthesis of monitoring clusters, including information security and information counteraction systems. International Telecommunication Union (ITU) Recommendation Y.3510 sets out requirements for cloud infrastructure, including monitoring the performance of deployed applications based on the collection of real-world statistics. The computing resources of monitoring clusters in cloud data centers are often allocated to continuous parallel processing of high-speed streaming data, which imposes new requirements on monitoring technologies and necessitates the creation and study of new models of parallel computing. Service monitoring plays an important role in the cloud computing industry, especially for SLA/QoS assessment, because an application or service may experience problems even when the virtual machines it runs on appear to be operational. This requires studying the methodological possibilities for organizing parallel processing of high-speed streaming services that handle huge volumes of bit data and, simultaneously, estimating the necessary computational resources. For conditions in which the bit rate of information generated by a source changes highly dynamically, a model of the bit rate of Discretized Stream (DStream) formation is proposed, which is of general applicability. Based on the poly-burst nature of the bit rate model, a model of the group content traffic of arbitrary sources of different services processed in the cloud cluster was created.
The results obtained made it possible to develop mathematical models of parallel DStreams from sources processed in a cloud cluster via Hadoop technology using the micro-batch architecture of the Spark Streaming module. These models take into account, on the one hand, the flow of service requests from sources of different services and, on the other hand, the bit rate needs of those services, given the multichannel traffic of sources of various services. In addition, analytical relations are obtained for calculating the required performance of the Hadoop cluster at a given batch loss probability.
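
To make the idea of sizing a cluster for a given batch loss probability concrete, the sketch below uses the classical Erlang B loss formula as a stand-in. This is an assumption for illustration only: the abstract does not state the paper's analytical relations, and its poly-burst, multi-service model is likely more detailed. The names `erlang_b` and `min_parallel_slots` are hypothetical, "slots" stands for parallel batch-processing capacity, and the offered load is taken to be the batch arrival rate multiplied by the mean batch processing time.

```python
def erlang_b(offered_load: float, slots: int) -> float:
    """Loss probability of an M/M/c/c system (Erlang B).

    offered_load: batch arrival rate times mean processing time (in Erlangs).
    slots: number of batches the cluster can process in parallel (assumed).
    Uses the standard recursion B(0) = 1, B(k) = a*B(k-1) / (k + a*B(k-1)).
    """
    if offered_load < 0 or slots < 0:
        raise ValueError("offered_load and slots must be non-negative")
    loss = 1.0
    for k in range(1, slots + 1):
        loss = offered_load * loss / (k + offered_load * loss)
    return loss


def min_parallel_slots(offered_load: float, target_loss: float) -> int:
    """Smallest number of slots whose Erlang B loss does not exceed target_loss."""
    if not 0.0 < target_loss <= 1.0:
        raise ValueError("target_loss must be in (0, 1]")
    slots = 0
    while erlang_b(offered_load, slots) > target_loss:
        slots += 1
    return slots


if __name__ == "__main__":
    # Hypothetical numbers: 40 batches/s arriving, 0.5 s mean processing time.
    load = 40 * 0.5
    print(min_parallel_slots(load, target_loss=0.01))
```

Under these assumptions, the required capacity grows faster than the offered load when the loss target is tightened, which is the kind of trade-off the analytical relations in the abstract are meant to quantify.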

The design of a streaming analytical workflow for processing massive transit feeds

Combining edge and cloud computing for mobility analytics

Progressive online aggregation in a distributed stream system

A Massive Sensor Data Streams Multi-dimensional Analysis Strategy Using Progressive Logarithmic Tilted Time Frame for Cloud-Based Monitoring Application.

Efficient Processing of Continuous Skyline Query over Smarter Traffic Data Stream for Cloud Computing

Transformation-based Streaming Workflow Allocation on Geo-Distributed Datacenters for Streaming Big Data Processing

Htme: A Data Streams Processing Strategy Based On Hoeffding Tree In Mapreduce Environment

Efficient Data Management for Intelligent Urban Mobility Systems

Moving big data to the cloud

Efficient Finer-Grained Incremental Processing with MapReduce for Big Data

Cloudets: Cloud-based Cognition for Large Streaming Data

Moving Big Data to The Cloud: An Online Cost-Minimizing Approach

Processing streams in a monitoring cloud cluster

A Data Streams Analysis Strategy Based on Hadoop Scheduling Optimization for Smart Grid Application.

Application of Batch and Stream Collaborative Computing in Urban Traffic Data Processing.

The Modeling Of Big Traffic Data Processing Based On Cloud Computing

Cost-Aware Streaming Workflow Allocation on Geo-Distributed Data Centers.

Streamlining trajectory map-matching: a framework leveraging spark and GPU-based stream processing

Implementing an Edge-Fog-Cloud architecture for stream data management

Coping With Heterogeneous Video Contributors and Viewers in Crowdsourced Live Streaming: A Cloud-Based Approach

Distributed Streaming Analytics on Large-scale Oceanographic Data using Apache Spark