Software Agents in Data and Workflow Management
T. Barrass, Y. Wu, I. Semeniouk, D. Bonacorsi, D. Newbold, L. Tuura, T. Wildish, C. Charlot, Nicola De Filippis, S. Metson, I. Fisk, J. Hernández, C. Grandi, A. Afaq, J. Rehn
DOI: https://doi.org/10.5170/CERN-2005-002.838
2004-11-01
Abstract: CMS currently uses a number of tools to transfer data which, taken together, form the basis of a heterogeneous datagrid. The range of tools used, and the directed, rather than optimized, nature of CMS's recent large-scale data challenge required the creation of a simple infrastructure that allowed a range of tools to operate in a complementary way. The system created comprises a hierarchy of simple processes (named ‘agents’) that propagate files through a number of transfer states. File locations and some application metadata were stored in POOL file catalogues, with LCG LRC or MySQL back-ends. Agents were assigned limited responsibilities, and were restricted to communicating state in a well-defined, indirect fashion through a central transfer management database. In this way, the task of distributing data was easily divided between different groups for implementation. The prototype system was developed rapidly, and achieved the required sustained transfer rate of ~10 MBps, with O(10) files distributed to 6 sites from CERN. Experience with the system during the data challenge raised issues with underlying technology (MSS write/read, stability of the LRC, maintenance of file catalogues, synchronization of filespaces), all of which have been successfully identified and handled. The development of this prototype infrastructure allows us to plan the evolution of backbone CMS data distribution from a simple hierarchy to a more autonomous, scalable model drawing on emerging agent and grid technology.

DATA DISTRIBUTION FOR CMS

The Compact Muon Solenoid (CMS) experiment at the LHC will produce petabytes of data a year [1]. This data is then to be distributed to multiple sites which form a hierarchical structure based on available resources: the detector is associated with a Tier 0 site; Tier 1 sites are typically large national computing centres; and Tier 2 sites are institutes with a more restricted availability of resources and/or services.
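The agent scheme outlined in the abstract, in which simple processes move files through transfer states and communicate only through a central database, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the table `transfer_state`, the state names, and the `advance` helper do not come from the CMS system, and an in-memory SQLite database stands in for the central transfer management database.

```python
import sqlite3

# Hypothetical state names; the actual CMS transfer states are not listed here.
STATES = ["available", "in_transfer", "at_destination", "safe"]


def make_db():
    """Create an in-memory stand-in for the central transfer management database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE transfer_state ("
        "guid TEXT, site TEXT, state TEXT, PRIMARY KEY (guid, site))"
    )
    return db


def advance(db, from_state, to_state, action):
    """One pass of a single-purpose agent.

    The agent picks up every (file, site) pair in `from_state`, runs `action`
    on it, and records `to_state` only if the action reports success. Agents
    never talk to each other; they only read and write rows in this table.
    """
    rows = db.execute(
        "SELECT guid, site FROM transfer_state WHERE state = ?", (from_state,)
    ).fetchall()
    moved = 0
    for guid, site in rows:
        if action(guid, site):
            db.execute(
                "UPDATE transfer_state SET state = ? WHERE guid = ? AND site = ?",
                (to_state, guid, site),
            )
            moved += 1
    db.commit()
    return moved
```

In this sketch each agent is just a call to `advance` with a different pair of states and a different action (start a copy, verify a copy, migrate to tape), which is one way the work could be divided between groups without any direct agent-to-agent messaging.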
A core set of Tier 1 sites with large tape, disk and network resources will receive raw and reconstructed data to safeguard against data loss at CERN. Smaller sites, associated with certain analysis groups or universities, will also subscribe to certain parts of the data. Sites at all levels will be involved in producing Monte Carlo data for comparison with detector data. At the Tier 0 the raw experiment data undergoes a process called reconstruction, in which it is restructured to represent physics objects. This data will be grouped hierarchically by stream and dataset based on physics content, then further subdivided by finer granularity metadata.

There are therefore three main use cases for distribution in CMS. The first can be described as a push with high priority, in which raw data is replicated to tape at Tier 1s. The second is a subscription pull, where a site subscribes to all data in a given set and data is transferred as it is produced. This use case corresponds to a site registering an interest in the data produced by an ongoing Monte Carlo simulation. The third is a random pull, where a site or individual physicist just wishes to replicate an extant dataset in a one-off transfer.

Although these use cases are discussed here in terms of push and pull, these descriptions can be slightly misleading. The key point is the effective handover of responsibility for replication between distribution components; for example, it is necessary to determine whether a replica has been created safely in a Tier 1 tape store before being able to delete it from a buffer at the source. This handover is enabled with well-defined handshakes or exchanges of state messages between distribution components.

The conceptual basis of data distribution for CMS is then distribution through a hierarchy of sites, with smaller sites associating themselves to larger ones by subscribing to some subset of the data stored at the larger site. The management of this data poses two overall problems.
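The handover of responsibility described above, where a file may leave the source buffer only once its subscribed destinations hold a safe copy, can be sketched as follows. This is a toy model under stated assumptions: the `Replica` class, the `subscriptions` mapping, and both function names are hypothetical, and "safe" simply means that a destination has reported a confirmed copy.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    guid: str
    dataset: str
    # Sites that have confirmed a safe copy (for example, written to tape).
    safe_at: set = field(default_factory=set)


def destinations(dataset, subscriptions):
    """Return the sites whose subscriptions include this dataset."""
    return {site for site, wanted in subscriptions.items() if dataset in wanted}


def can_delete_from_buffer(replica, subscriptions):
    """Allow deletion at the source only after every subscriber reports a safe copy.

    A file with no subscribers is kept, since nobody has taken over
    responsibility for it yet.
    """
    required = destinations(replica.dataset, subscriptions)
    return bool(required) and required <= replica.safe_at
```

With two sites subscribed to the same dataset, `can_delete_from_buffer` stays false until both have been added to `safe_at`; the source never has to know how each site performed its copy, only the state it reported.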
The first problem is that sustained transfers at the 100+ MBps estimated for CMS alone are currently only approached by existing experiments. The second problem is one of managing the logistics of subscription transfer based on metadata at granularities between high