'R: Towards Detecting and Understanding Code-Document Violations in Rust
Wanrong Ouyang, Baojian Hua
DOI: https://doi.org/10.1109/ISSREW53611.2021.00063
2021-01-01
Abstract: Documentation and comments are important for any software project. Although documentation is not executed, it is useful for many purposes, such as code comprehension, reuse, and maintenance. As a project evolves, the code and documentation can easily grow out of sync, introducing inconsistencies that can mislead developers and cause new bugs in subsequent development. Recent studies have shown that natural language processing and machine learning are promising for detecting inconsistencies between code and documentation. However, it is challenging to apply existing techniques to Rust programs, because Rustdoc supports advanced documentation features such as documentation testing, which makes existing solutions inapplicable. This paper presents the first software tool prototype, 'R, to detect and understand code-document inconsistencies in Rust. To perform such analysis, 'R leverages static program analysis, not only on Rust source code but also on documentation testing code, to detect inconsistencies that indicate either bugs or bad documentation. To evaluate the effectiveness of 'R, we applied it to 37 open source Rust projects from 9 domains, with a total of 6,192,251 lines of Rust source code (with 322,330 lines of comments). The results of the analysis give interesting insights; for example, the cryptocurrency domain has the highest documentation ratio (58.23%), and documentation testing is rarely used (ratio 2.30% on average) in real-world Rust projects in all domains. Based on these findings, we propose recommendations to guide the construction of better Rust documentation, better Rust documentation quality detection tools, and broader adoption of the language.
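
For readers unfamiliar with the Rustdoc documentation testing mentioned above, the following is a minimal sketch of what a documentation test looks like. It is not taken from the paper or from 'R. The function `abs_diff` and the crate name `my_crate` are hypothetical and chosen only for illustration. The sketch assumes the code lives in a library crate (for example in `src/lib.rs`), since `cargo test` runs documentation tests for library targets.

```rust
/// Returns the absolute difference between `a` and `b`.
///
/// # Examples
///
/// The fenced example below is a documentation test: `cargo test`
/// compiles and runs it together with the ordinary unit tests.
/// (`my_crate` is a hypothetical crate name.)
///
/// ~~~
/// use my_crate::abs_diff;
///
/// assert_eq!(abs_diff(3, 5), 2);
/// assert_eq!(abs_diff(5, 3), 2);
/// ~~~
pub fn abs_diff(a: u32, b: u32) -> u32 {
    // Subtract the smaller value from the larger one so the
    // unsigned subtraction can never underflow.
    if a > b {
        a - b
    } else {
        b - a
    }
}
```

Because the example inside the comment is executed, documentation of this kind contains code as well as prose. If the implementation later changes while the comment is left untouched, the assertions in the documentation test can fail, which is one way a mismatch between code and documentation becomes visible.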