Baler -- Machine Learning Based Compression of Scientific Data

Fritjof Bengtsson, Caterina Doglioni, Per Alexander Ekman, Axel Gallén, Pratik Jawahar, Alma Orucevic-Alagic, Marta Camps Santasmasas, Nicola Skidmore, Oliver Woolland
2024-02-16
Abstract: Storing and sharing increasingly large datasets is a challenge across scientific research and industry. In this paper, we document the development and applications of Baler, a Machine Learning based data compression tool for use across scientific disciplines and industry. Here, we present Baler's performance for the compression of High Energy Physics (HEP) data, as well as its application to Computational Fluid Dynamics (CFD) toy data as a proof-of-principle. We also present suggestions for cross-disciplinary guidelines to enable feasibility studies for machine learning based compression for scientific data.
Computational Physics, High Energy Physics - Experiment