Abstract: This paper summarizes the storage options that we implemented for the CMSWEB cluster in the Kubernetes (k8s) infrastructure. All CMSWEB services require storage for logs, and some also require storage for data. We provide a feasibility analysis of these options, describe the pros and cons of each from the perspective of the CMSWEB cluster and its users, and propose recommendations according to service needs. The first option is CephFS, which can be mounted multiple times across various clusters and VMs and works very well with k8s; we use it for both data and logs. The second option is a Cinder volume, which is block storage with a filesystem running on top of it. It can be attached to only one instance at a time, and we use it only for data. The third option is S3, a scalable object storage service that can be used by applications compatible with the Amazon S3 protocol; we use it for logs. For S3 we explored two mechanisms. In the first, fluentd runs as a sidecar container in the service pods and sends logs to an S3 bucket. In the second, filebeat runs as a sidecar container in the service pod and scrapes the logs, forwarding them to fluentd, which runs as a daemonset on each node and finally sends the logs to S3. The fourth option is EOS, which we configured inside the pods of the CMSWEB services. The fifth option is to use dedicated VMs that have a Ceph volume attached. For both EOS and the VMs, logs from the service pods are transferred using rsync. The last option is to send service logs to Elasticsearch. This is implemented with fluentd running as a daemonset on each node: in parallel with sending logs to S3, fluentd also sends them to the Elasticsearch infrastructure at CERN.
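
The options listed in the abstract can be summarized as a small lookup table. The following Python sketch is purely illustrative: the `StorageOption` class, the `OPTIONS` list, and the `options_for` helper are hypothetical names introduced here, not part of the CMSWEB tooling, and the table records only what the abstract states about each option.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class StorageOption:
    """One storage option as described in the abstract (hypothetical model)."""
    name: str
    used_for: frozenset[str]   # "data" and/or "logs", as stated in the abstract
    delivery: str              # how content reaches the storage, per the abstract


# Contents mirror the abstract; nothing here is measured or configured.
OPTIONS = [
    StorageOption("CephFS", frozenset({"data", "logs"}),
                  "shared filesystem, mountable across clusters and VMs"),
    StorageOption("Cinder volume", frozenset({"data"}),
                  "block storage attached to one instance at a time"),
    StorageOption("S3", frozenset({"logs"}),
                  "fluentd sidecar, or filebeat sidecar plus fluentd daemonset"),
    StorageOption("EOS", frozenset({"logs"}), "rsync from service pods"),
    StorageOption("Dedicated VM with Ceph volume", frozenset({"logs"}),
                  "rsync from service pods"),
    StorageOption("Elasticsearch", frozenset({"logs"}),
                  "fluentd daemonset on each node"),
]


def options_for(need: str) -> list[str]:
    """Return the names of options the abstract reports using for `need`."""
    return [opt.name for opt in OPTIONS if need in opt.used_for]


if __name__ == "__main__":
    print("data:", options_for("data"))
    print("logs:", options_for("logs"))
```

Calling `options_for("data")` returns only CephFS and the Cinder volume, which reflects the abstract's statement that the remaining options are used for logs.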

Development of Monitoring and Analysis Tools for the Huawei Cloud Storage System

The CMS monitoring infrastructure and applications

Research and Development of Comprehensive Monitoring and Management Platform for Substation

Evaluation and Implementation of Various Persistent Storage Options for CMSWEB Services in Kubernetes Infrastructure at CERN

A Cloud-based architecture for the Cherenkov Telescope Array observation simulations. Optimisation, design, and results

Web Based Monitoring in the CMS Experiment at CERN

JTangCMS: an Efficient Monitoring System for Cloud Platforms.

Monitoring the development of CFD applications on unstable HPC platforms

International Network Performance and Security Testing Based on Distributed Abyss Storage Cluster and Draft of Data Lake Framework

The archive solution for distributed workflow management agents of the CMS experiment at LHC

Separation is for Better Reunion: Data Lake Storage at Huawei

SimMon: A Toolkit for Simulating Monitoring Mechanism in Cloud Computing Environments.

Monitoring and Analytics at INFN Tier-1: the next step

From Facility to Application Sensor Data: Modular, Continuous and Holistic Monitoring with DCDB

AstroCloud, a Cyber-Infrastructure for Astronomy Research: Data Archiving and Quality Control

Data Management at Huawei: Recent Accomplishments and Future Challenges

Evaluation of a new visualization and analytics solution for slow control data for large scale experiments

HEP Benchmark Suite: Enhancing Efficiency and Sustainability in Worldwide LHC Computing Infrastructures

Zero+: Monitoring Large-Scale Cloud-Native Infrastructure Using One-Sided RDMA

Long-term field studies of a distributed network of sensors for environmental radiological monitoring

Upgrade and integration of the configuration and monitoring tools for the ATLAS Online farm