The ALICE analysis train system

Markus Zimmermann
DOI: https://doi.org/10.48550/arXiv.1502.06381
2015-02-23
Abstract:In the ALICE experiment hundreds of users are analyzing big datasets on a Grid system. High throughput and short turn-around times are achieved by a centralized system called the LEGO trains. This system combines analysis from different users in so-called analysis trains which are then executed within the same Grid jobs thereby reducing the number of times the data needs to be read from the storage systems. The centralized trains improve the performance, the usability for users and the bookkeeping in comparison to single user analysis. The train system builds upon the already existing ALICE tools, i.e. the analysis framework as well as the Grid submission and monitoring infrastructure. The entry point to the train system is a web interface which is used to configure the analysis and the desired datasets as well as to test and submit the train. Several measures have been implemented to reduce the time a train needs to finish and to increase the CPU efficiency.
High Energy Physics - Experiment,Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?