RefExpo: Unveiling Software Project Structures through Advanced Dependency Graph Extraction

Vahid Haratian,Pouria Derakhshanfar,Vladimir Kovalenko,Eray Tüzün
2024-07-03
Abstract:Assessing the dependency graph (DG) of a software project offers valuable insights for identifying its key components. Numerous studies have explored extracting DGs and leveraging them for various analyses, including security and bus factor calculations. However, there is a lack of user-friendly tools for DG extraction, and no comprehensive DG datasets from open-source projects are available. This study introduces RefExpo, an easy-to-use DG extraction tool supporting multiple languages like Java, Python, and JavaScript. Based on the IntelliJ plugin SDK, RefExpo ensures compatibility with various project structures and technology versions. We also provide a dataset of 20 Java and Python projects, with plans to expand upon request. To validate RefExpo we focused on Java and Python. Our tests showed RefExpo achieving 92% and 100% recall on micro test suites Judge and PyCG for Python and Java, respectively. In macro-level experiments, RefExpo outperformed existing tools by at least 31% and 7% in finding unique and shared results. You can access the source code of our tool from our replication package1. The installable version of RefExpo is available on the IntelliJ marketplace. Additionally, a short video describing its functionality can be viewed here: <a class="link-external link-https" href="https://youtu.be/eCnPUlj6YgA" rel="external noopener nofollow">this https URL</a>.
Software Engineering
What problem does this paper attempt to address?