Agile-Ant: Self-Managing Distributed Cache Management for Cost Optimization of Big Data Applications

Hani Al-Sayeh, Muhammad Attahir Jibril, Kai-Uwe Sattler
DOI: https://doi.org/10.14778/3681954.3681990
IF: 2.5
2024-07-01
Proceedings of the VLDB Endowment
Abstract: Distributed in-memory processing frameworks accelerate application runs by caching important datasets in memory. Allocating a suitable cluster configuration for caching these datasets plays a crucial role in achieving minimal cost. We present Agile-ant, a self-managing framework that identifies important datasets and scales out the cluster memory to cache them on the fly, without any human interaction and without any prior knowledge of the application, the characteristics of the input data, the specification of the computing resources, or their utilization by multiple tenants. We evaluate Agile-ant on various real-world applications. Compared with our baseline, Agile-ant reduces execution cost by 78.3% on average and provides better performance than the related work.
computer science, information systems, theory & methods