Individual context-free online community health indicators fail to identify open source software sustainability

Yo Yehudi, Carole Goble, Caroline Jay
2024-05-09
Abstract: The global value of open source software is estimated to be in the billions or trillions worldwide [1], but despite this, it is often under-resourced and subject to high-impact security vulnerabilities and stability failures [2,3]. In order to investigate factors contributing to open source community longevity, we monitored thirty-eight open source projects over the period of a year, focusing primarily, but not exclusively, on open science-related online code-oriented communities. We measured performance indicators, using both subjective and qualitative measures (participant surveys), as well as using computational scripts to retrieve and analyse indicators associated with these projects' online source control codebases. None of the projects were abandoned during this period, and only one project entered a planned shutdown. Project ages spanned from under one year to over forty years old at the start of the study, and results were highly heterogeneous, showing little commonality across documentation, mean response times for issues and code contributions, and available funding/staffing resources. Whilst source code-based indicators were able to offer some insights into project activity, we observed that similar indicators across different projects often had very different meanings when context was taken into account. We conclude that the individual context-free metrics we studied were not sufficient or essential for project longevity and sustainability, and might even become detrimental if used to support high-stakes decision making. When attempting to understand an online open community's longer-term sustainability, we recommend that researchers avoid cross-project quantitative comparisons, and advise instead that they use single-project-level assessments which combine quantitative measures with contextualising qualitative data.
Software Engineering, Computers and Society
What problem does this paper attempt to address?
The paper asks: **Can individual, context-free online community health indicators accurately identify the long-term sustainability (longevity) of open-source projects?** Specifically, the researchers examine whether these indicators can identify a project's continuity and stability over a 12-month period.

### Research Background and Problem Description

Open-source software (OSS) has huge economic value, estimated in the billions or even trillions globally. Despite this, open-source projects are often under-resourced and subject to high-impact security vulnerabilities and stability failures. To explore the factors that contribute to the longevity of open-source communities, the researchers monitored 38 open-source projects over one year, focusing primarily, but not exclusively, on online code communities related to open science.

### Research Questions

The main research question is:

- **Can individual, context-free online community health indicators accurately identify a project's long-term viability over the 12-month period?**

### Specific Problem Decomposition

1. **Effectiveness of Health Indicators**: Can existing health indicators effectively predict the long-term survival of projects?
2. **Impact of Context**: The same indicator can mean very different things in different projects, so context must be taken into account.
3. **Limitations of Cross-Project Comparison**: Is cross-project quantitative comparison helpful for understanding the long-term sustainability of a single project?

### Conclusions

The study found that the individual context-free indicators it examined were neither sufficient nor essential for project longevity and sustainability, and might even become detrimental if used to support high-stakes decisions. The study therefore recommends that researchers avoid cross-project quantitative comparisons and instead use single-project-level assessments that combine quantitative measures with contextualising qualitative data. The aim is a more reliable basis for understanding the health and long-term sustainability of open-source projects.