LogGrep: Fast and Cheap Cloud Log Storage by Exploiting Both Static and Runtime Patterns.

Junyu Wei, Guangyan Zhang, Junchao Chen, Yang Wang, Weimin Zheng, Tingtao Sun, Jiesheng Wu, Jiangwei Jiang
DOI: https://doi.org/10.1145/3552326.3567484
2023-01-01
Abstract: In cloud systems, near-line logs are mainly used for debugging, which means they prefer a low query latency for a better user experience, and like any other logs, they also prefer a low overall cost including storage cost to store compressed logs and computation cost to compress logs and execute queries. This paper proposes LogGrep, the first log compression and query tool that structurizes and organizes log data properly in fine-grained units by exploiting both static and runtime patterns. It first parses logs into variable vectors by exploiting static patterns and then extracts runtime pattern(s) automatically within each variable vector with a novel extraction method. Based on these runtime patterns, LogGrep further decomposes the variable vectors into fine-grained units called "Capsules" and stamps each Capsule with a summary of its values. During the query process, LogGrep can avoid decompressing and scanning Capsules that cannot possibly match the keywords, with the help of the extracted runtime patterns and the Capsule stamps. We evaluate LogGrep on 21 types of logs from the production environment of Alibaba Cloud, and 16 types of logs from the public datasets. The results show that LogGrep can reduce query latency and overall cost by an order of magnitude compared to state-of-the-art works. Such results have confirmed that exploiting both static and runtime patterns to structurize logs can achieve fast and cheap cloud log storage.
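The pipeline described in the abstract (static pattern, variable vectors, runtime pattern, stamped Capsules, query-time skipping) can be illustrated with a toy Python sketch. This is not LogGrep's implementation: the log template, the use of a shared prefix as the "runtime pattern", the stamp contents (character set and maximum length), the exact-match query, and every function name below are assumptions made only for illustration.

```python
import os
import re
import zlib
from dataclasses import dataclass

# Hypothetical static pattern (log template) with two variable slots.
TEMPLATE = re.compile(r"Read block (\S+) from (\S+)")


@dataclass
class Capsule:
    payload: bytes        # compressed values, one per line
    charset: frozenset    # stamp: every character seen in the values
    max_len: int          # stamp: length of the longest value


def make_capsule(values):
    return Capsule(
        payload=zlib.compress("\n".join(values).encode()),
        charset=frozenset("".join(values)),
        max_len=max(len(v) for v in values),
    )


def build(lines):
    # Step 1: the static pattern splits each line into variable values;
    # transposing the rows yields one variable vector per slot.
    rows = [TEMPLATE.fullmatch(line).groups() for line in lines]
    vectors = [list(col) for col in zip(*rows)]
    store = []
    for vec in vectors:
        # Step 2: toy "runtime pattern" = prefix shared by all values.
        prefix = os.path.commonprefix(vec)
        # Step 3: keep only the varying remainder in a stamped Capsule.
        store.append((prefix, make_capsule([v[len(prefix):] for v in vec])))
    return store


def find(store, slot, value):
    """Return the indices of log lines whose variable `slot` equals `value`."""
    prefix, cap = store[slot]
    if not value.startswith(prefix):
        return []  # ruled out by the runtime pattern alone
    rest = value[len(prefix):]
    if len(rest) > cap.max_len or not set(rest) <= cap.charset:
        return []  # ruled out by the stamp, no decompression needed
    values = zlib.decompress(cap.payload).decode().split("\n")
    return [i for i, v in enumerate(values) if v == rest]


lines = [
    "Read block blk_1001 from 10.0.0.1",
    "Read block blk_1002 from 10.0.0.2",
    "Read block blk_2007 from 10.0.0.1",
]
store = build(lines)
print(find(store, 0, "blk_1002"))  # decompresses the Capsule and scans it
print(find(store, 0, "blk_9999"))  # skipped: '9' is not in the stamp
```

In this sketch the two early returns in `find` are where the saving would come from: a query that contradicts the shared prefix or the stamp never pays for decompression.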