Abstracting the storage and retrieval of image data at the LSST

Tim Jenness, James F. Bosch, Pim Schellart, Kian-Tat Lim, Andrei Salnikov, Michelle Gower
DOI: https://doi.org/10.48550/arXiv.1812.08085
2018-12-20
Abstract: Writing generic data processing pipelines requires that the algorithmic code never needs to know the formats of data files or the locations of those files. At LSST we have a software system, known as "the Data Butler", that abstracts these details away from the software developer. Scientists can specify the dataset they want in terms they understand, such as filter, observation identifier, date of observation, and instrument name, and the Butler translates that request into one or more files, which are read and returned to them as a single Python object. Conversely, once they have created a new dataset, they can give it back to the Butler with a label describing its new status, and the Butler writes it in whatever format it has been configured to use. All configuration is in YAML and supports standard defaults whilst allowing overrides.
Instrumentation and Methods for Astrophysics
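
The get/put pattern described in the abstract can be illustrated with a minimal toy sketch. This is not the LSST Data Butler API: the class `ToyButler`, the `FORMATTERS` table, the dataset type names `raw` and `calexp`, and the instrument name `ToyCam` are all hypothetical, and a plain Python dict stands in for the YAML configuration so that the example needs only the standard library. It only shows the idea that the caller names a dataset by its identifying values while file paths and formats stay hidden behind configuration.

```python
import json
import pickle
import tempfile
from pathlib import Path

# Hypothetical formatters: each entry is a (writer, reader) pair for one format.
FORMATTERS = {
    "json": (lambda obj, p: p.write_text(json.dumps(obj)),
             lambda p: json.loads(p.read_text())),
    "pickle": (lambda obj, p: p.write_bytes(pickle.dumps(obj)),
               lambda p: pickle.loads(p.read_bytes())),
}


class ToyButler:
    """Toy stand-in for a data butler; not the LSST implementation."""

    def __init__(self, root, config):
        self.root = Path(root)
        # Assumption: config maps a dataset type to a format name.
        # In the real system this role is played by YAML configuration.
        self.config = config

    def _path(self, dataset_type, data_id):
        # Derive a file location from the data ID; callers never see it.
        key = "_".join(f"{k}-{data_id[k]}" for k in sorted(data_id))
        return self.root / dataset_type / f"{key}.{self.config[dataset_type]}"

    def put(self, obj, dataset_type, **data_id):
        # Write the object in whatever format is configured for this type.
        path = self._path(dataset_type, data_id)
        path.parent.mkdir(parents=True, exist_ok=True)
        writer, _ = FORMATTERS[self.config[dataset_type]]
        writer(obj, path)

    def get(self, dataset_type, **data_id):
        # Read the file back and return it as a single Python object.
        _, reader = FORMATTERS[self.config[dataset_type]]
        return reader(self._path(dataset_type, data_id))


def demo():
    data_id = {"instrument": "ToyCam", "visit": 42, "filter": "r"}
    with tempfile.TemporaryDirectory() as root:
        butler = ToyButler(root, {"raw": "json", "calexp": "pickle"})
        butler.put({"pixels": [1, 2, 3]}, "raw", **data_id)
        # The "algorithm" sees only Python objects, never paths or formats.
        raw = butler.get("raw", **data_id)
        calibrated = {"pixels": [2 * p for p in raw["pixels"]]}
        butler.put(calibrated, "calexp", **data_id)
        return butler.get("calexp", **data_id)
```

In this sketch, switching `calexp` from `pickle` to `json` in the configuration dict would change the on-disk format without touching the code in `demo`, which is the separation of concerns the abstract describes.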