SHREC: a SRE Behaviour Knowledge Graph Model for Shell Command Recommendations

Andrea Tonon, Bora Caglayan, MingXue Wang, Peng Hu, Fei Shen, Puchao Zhang
DOI: https://doi.org/10.1109/SANER60148.2024.00048
2024-08-11
Abstract: In IT system operations, shell commands are common command line tools used by site reliability engineers (SREs) for daily tasks, such as system configuration, package deployment, and performance optimization. The efficiency in their execution has a crucial business impact since shell commands very often aim to execute critical operations, such as the resolution of system faults. However, many shell commands involve long parameters that make them hard to remember and type. Additionally, the experience and knowledge of SREs using these commands are almost always not preserved. In this work, we propose SHREC, a SRE behaviour knowledge graph model for shell command recommendations. We model the SRE shell behaviour knowledge as a knowledge graph and propose a strategy to directly extract such a knowledge from SRE historical shell operations. The knowledge graph is then used to provide shell command recommendations in real-time to improve the SRE operation efficiency. Our empirical study based on real shell commands executed in our company demonstrates that SHREC can improve the SRE operation efficiency, allowing to share and re-utilize the SRE knowledge.
Software Engineering
What problem does this paper attempt to address?
The paper aims to improve the operational efficiency of site reliability engineers (SREs) when they execute shell commands, and to make SRE knowledge shareable and reusable. It focuses on three issues:

1. **Commands that are hard to remember and type**: Many shell commands take long parameters, which are difficult to recall and enter, so operations become slow.
2. **Preservation and reuse of empirical knowledge**: The experience SREs build up when using shell commands for analysis or troubleshooting is usually not preserved, so other SREs cannot reuse it.
3. **Limitations of existing systems**: Existing shell recommendation systems rely mainly on auto-completion from command history. They do not model SRE operational knowledge and cannot provide sequence-based recommendations.

To address these issues, the paper proposes **SHREC**, an SRE behaviour knowledge graph model, which works in three stages:

- **Extract SRE behaviour knowledge from historical data**: Historical SRE shell operation data is parsed, processed, mined for patterns, and aggregated, so that useful commands and command sequences are extracted automatically.
- **Construct the SRE behaviour knowledge graph**: The extracted knowledge is represented as a knowledge graph whose entities include commands, file paths, users, and intentions, together with the relationships between them.
- **Recommend from the knowledge graph**: The knowledge graph is used to recommend commands and command sequences in real time, helping SREs perform tasks more efficiently and share their operational experience.

In this way, SHREC aims both to speed up SRE operations and to preserve SRE knowledge for reuse, which in turn improves the overall efficiency of IT system operation and maintenance.
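To make the graph-then-recommend idea concrete, here is a minimal Python sketch. It is not the paper's schema or algorithm: the `CommandGraph` class, the `executed` and `followed_by` relation names, the `build_graph` and `recommend_next` helpers, and the sample sessions are all hypothetical, and simple co-occurrence counting stands in for the paper's pattern mining and aggregation.

```python
from collections import defaultdict


class CommandGraph:
    """Toy knowledge graph: weighted, labelled edges between entities (hypothetical schema)."""

    def __init__(self):
        # edges[(source, relation)][target] = how often the relation was observed
        self.edges = defaultdict(lambda: defaultdict(int))

    def add_edge(self, source, relation, target, weight=1):
        self.edges[(source, relation)][target] += weight

    def neighbours(self, source, relation):
        # .get avoids creating empty entries for unknown sources
        return dict(self.edges.get((source, relation), {}))


def build_graph(sessions):
    """Build the graph from (user, [commands...]) pairs taken from shell history."""
    graph = CommandGraph()
    for user, commands in sessions:
        for command in commands:
            graph.add_edge(user, "executed", command)
        # Link each command to the one typed immediately after it.
        for previous, following in zip(commands, commands[1:]):
            graph.add_edge(previous, "followed_by", following)
    return graph


def recommend_next(graph, last_command, k=3):
    """Return up to k commands that most often followed last_command."""
    candidates = graph.neighbours(last_command, "followed_by")
    # Highest count first; ties broken alphabetically for deterministic output.
    ranked = sorted(candidates.items(), key=lambda pair: (-pair[1], pair[0]))
    return [command for command, _ in ranked[:k]]


# Made-up sample history, for illustration only.
SESSIONS = [
    ("alice", ["df -h", "du -sh /var/log", "rm /var/log/old.log"]),
    ("bob", ["df -h", "du -sh /var/log"]),
    ("carol", ["df -h", "free -m"]),
]
GRAPH = build_graph(SESSIONS)
```

With this sample history, `recommend_next(GRAPH, "df -h")` ranks `du -sh /var/log` ahead of `free -m`, because two of the three users ran it next. A command that was never seen yields an empty list.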
### Formula Presentation

The sequence-pattern mining in the paper rests on the following definitions:

- **Sequence Length**: A sequence \( s = \langle i_{j_1}, i_{j_2}, \ldots, i_{j_\ell} \rangle \) is an ordered list of \(\ell\) elements, and its length is \[ |s| = \ell \]
- **Subsequence Definition**: Given a gap constraint \( g \), \( a = \langle a_1, a_2, \ldots, a_{|a|} \rangle \) is a subsequence of \( b = \langle b_1, b_2, \ldots, b_{|b|} \rangle \), written \( a \sqsubseteq_g b \), if and only if there exist integers \( 1 \leq r_1 < r_2 < \cdots < r_{|a|} \leq |b| \) such that \( a_1 = b_{r_1}, a_2 = b_{r_2}, \ldots, a_{|a|} = b_{r_{|a|}} \) and, for each pair of consecutive elements \( a_j, a_{j+1} \in a \), \( r_{j+1} - r_j \leq g \), where \( j \in \{1, \ldots, |a| - 1\} \).
- **Support**: \[ \text{supp}_D(s, g) = |\{\tau \in D : s \sqsubseteq_g \tau\}| \] that is, the number of transactions in the dataset \( D \) that contain \( s \) as a subsequence under gap constraint \( g \).
- **Frequency**: \[ f_D(s, g) = \frac{\text{supp}_D(s, g)}{|D|} \] that is, the fraction of transactions in \( D \) that contain \( s \).

These definitions underpin the mining and evaluation of sequence patterns on which the command recommendations are built.
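The definitions above can be checked with a short Python sketch. It assumes each transaction is a plain list of command strings, and the function names (`is_gap_subsequence`, `support`, `frequency`) are illustrative rather than taken from the paper. Greedy matching on the earliest occurrence can miss valid matches once a gap limit applies, so the check tracks every position at which a prefix of the pattern can end.

```python
def is_gap_subsequence(a, b, g):
    """Return True if a is a subsequence of b with consecutive matches at most g apart."""
    if not a:
        return True  # the empty sequence is trivially contained
    # ends: positions in b at which the current prefix of a can end
    ends = {i for i, item in enumerate(b) if item == a[0]}
    for wanted in a[1:]:
        ends = {
            i
            for i, item in enumerate(b)
            if item == wanted
            # some match of the previous element lies 1..g positions earlier
            and any((i - step) in ends for step in range(1, g + 1))
        }
        if not ends:
            return False
    return bool(ends)


def support(dataset, s, g):
    """supp_D(s, g): number of transactions that contain s under gap constraint g."""
    return sum(1 for transaction in dataset if is_gap_subsequence(s, transaction, g))


def frequency(dataset, s, g):
    """f_D(s, g): support divided by |D| (0.0 for an empty dataset, by assumption)."""
    if not dataset:
        return 0.0
    return support(dataset, s, g) / len(dataset)


# Made-up transactions, for illustration only.
DATASET = [
    ["cd", "ls", "cat"],
    ["cd", "pwd", "ls", "vim", "cat"],
    ["ls", "cat"],
]
```

On this toy dataset, `support(DATASET, ["cd", "ls"], 1)` is 1 because only the first transaction has `ls` immediately after `cd`. Relaxing the constraint to `g = 2` also admits the second transaction, giving a support of 2 and a frequency of 2/3.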