A distributed data warehouse system for astroparticle physics
Minh-Duc Nguyen, Alexander Kryukov, Julia Dubenskaya, Elena Korosteleva, Stanislav Polyakov, Evgeny Postnikov, Igor Bychkov, Andrey Mikhailov, Alexey Shigarov, Oleg Fedorov, Yulia Kazarina, Dmitry Shipilov, Dmitry Zhurov
DOI: https://doi.org/10.48550/arXiv.1812.01906
2018-12-05
Abstract: Building a distributed data warehouse system is one of the pressing issues in astroparticle physics. Well-known experiments such as TAIGA and KASCADE-Grande produce tens of terabytes of data measured by their instruments. It is critical to have a smart on-site data warehouse system that stores the collected data effectively for further distribution. It is also vital to give scientists a user-friendly interface for accessing the collected data, with proper permissions, not only on-site but also online. Online access is particularly useful when scientists need to combine data from different experiments for analysis. In this work, we describe an approach to implementing a distributed data warehouse system that allows scientists to acquire just the necessary data from different experiments via the Internet on demand. The implementation is based on CernVM-FS, with additional components that we developed to search through all available data sets and deliver subsets of them to users' computers.
Instrumentation and Methods for Astrophysics; Distributed, Parallel, and Cluster Computing