Automatically Generating Descriptive Texts in Logging Statements: How Far Are We?

Xiaotong Liu, Tong Jia, Ying Li, Hao Yu, Yang Yue, Chuanjia Hou
DOI: https://doi.org/10.1007/978-3-030-64437-6_13
2020-01-01
Abstract: In most cases, logs are the only accurate information available for administrators to understand system behavior and diagnose failure root causes. However, due to the lack of well-defined logging guidance, it is challenging for developers to decide what to log, especially for logging statements that contain descriptive texts and variables. In this paper, we explore the automatic generation of descriptive texts in logging statements and evaluate the effectiveness of various automatic generation methods. We propose that generating descriptive texts in logging statements can be framed as a retrieval-based Q&A task. According to the roles of query and answer, we design two retrieval strategies, Code&Code and Code&Log. To measure the similarity between the query and the answer, we use two types of retrieval algorithms: information retrieval-based and neural network-based algorithms. We conduct a systematic analysis of the effectiveness of these retrieval algorithms under the different retrieval strategies, assess their accuracy using automatic metrics and human evaluation, and present five instructive findings. We believe that these findings have implications for both researchers and practitioners working on related problems. Moreover, we construct and release a log text dataset containing over 138K valid log texts from 85 Java projects in the Apache ecosystem for future logging statement analysis and generation.
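To make the two retrieval strategies concrete, the following is a minimal sketch, not the authors' implementation. It assumes that Code&Code matches the query code against stored code snippets and returns the log text paired with the best match, while Code&Log matches the query code directly against stored log texts. The bag-of-words cosine similarity, the toy corpus, and the names `tokenize`, `cosine`, `CORPUS`, and `retrieve` are all hypothetical stand-ins chosen for illustration; the abstract does not name the specific algorithms used.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split code or log text into lowercase word tokens, breaking camelCase."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return [t.lower() for t in re.findall(r"[A-Za-z]+", spaced)]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical toy corpus of (code snippet, log text) pairs.
CORPUS = [
    ("File file = new File(path); if (!file.exists()) {", "file not found"),
    ("conn = dataSource.getConnection(); conn.close();",
     "failed to close database connection"),
    ("server.start(port);", "server started on port"),
]


def retrieve(query_code, strategy="code&code"):
    """Return the log text of the corpus entry that best matches the query.

    code&code: compare the query code with each stored code snippet.
    code&log:  compare the query code with each stored log text.
    (Assumed reading of the two strategies; see the lead-in.)
    """
    query = Counter(tokenize(query_code))

    def score(pair):
        code, log = pair
        target = code if strategy == "code&code" else log
        return cosine(query, Counter(tokenize(target)))

    return max(CORPUS, key=score)[1]
```

With this toy corpus, `retrieve("if (!configFile.exists()) { File f = new File(configPath);")` returns "file not found", and `retrieve("closeConnection(conn)", strategy="code&log")` returns "failed to close database connection". A neural variant would replace `cosine` over token counts with a similarity over learned embeddings while keeping the same retrieval loop.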