Detecting outdated code element references in software repository documentation

Wen Siang Tan,Markus Wagner,Christoph Treude
DOI: https://doi.org/10.1007/s10664-023-10397-6
IF: 3.762
2023-11-23
Empirical Software Engineering
Abstract:Outdated documentation is a pervasive problem in software development, preventing effective use of software, and misleading users and developers alike. We posit that one possible reason why documentation becomes out of sync so easily is that developers are unaware of when their source code modifications render the documentation obsolete. Ensuring that the documentation is always in sync with the source code takes considerable effort, especially for large codebases. To address this situation, we propose an approach that can automatically detect code element references that survive in the documentation after all source code instances have been deleted. In this work, we analysed over 3,000 GitHub projects and found that most projects contain at least one outdated code element reference at some point in their history. We submitted GitHub issues to real-world projects containing outdated references detected by our approach, some of which have already led to documentation fixes. As an initiative toward keeping documentation in software repositories up-to-date, we have made our implementation available for developers to scan their GitHub projects for outdated code element references.
computer science, software engineering
What problem does this paper attempt to address?