What Are the Dominant Projects in the GitHub Python Ecosystem?

Wanwangying Ma, Lin Chen, Yuming Zhou, Baowen Xu
DOI: https://doi.org/10.1109/tsa.2016.23
2016-01-01
Abstract: GitHub, a popular social software development platform, has fostered a variety of software ecosystems in which projects depend on one another and co-evolve. Projects located in the central hub of an ecosystem are presumed to be important and can affect many other projects. However, few studies have investigated the dominant projects in a software ecosystem. In this study, we aim to identify the most influential projects in the GitHub Python ecosystem. We first construct the GitHub Python ecosystem, comprising 19797 projects, by identifying their inter-dependencies. Then, we calculate four kinds of centrality metrics to measure the centrality and influence of each project in the ecosystem. Finally, we evaluate each project's popularity using GitHub social methods and compare the consistency of the two measurements. Our results indicate that 1) the most influential projects are mostly custom libraries; 2) only a small number of projects have large values of the centrality metrics; 3) the dominant projects are not always popular among GitHub users. Our results help researchers and practitioners gain a better understanding of the GitHub Python ecosystem.
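As an illustration of the approach the abstract outlines (build a dependency graph, then score each project by centrality), the following is a minimal sketch on a toy graph. The abstract does not name the four centrality metrics, so the two measures here (normalized in-degree and the count of transitive dependents) are assumptions chosen for illustration, and the project names and the `DEPENDS_ON` mapping are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical toy data: each project maps to the projects it depends on.
DEPENDS_ON = {
    "app_a": ["lib_core", "lib_http"],
    "app_b": ["lib_core"],
    "lib_http": ["lib_core"],
    "lib_core": [],
}


def reverse_edges(depends_on):
    """Map each project to the set of projects that depend on it directly."""
    dependents = defaultdict(set)
    for project, deps in depends_on.items():
        dependents[project]  # make sure every project has an entry
        for dep in deps:
            dependents[dep].add(project)
    return dict(dependents)


def in_degree_centrality(depends_on):
    """Fraction of the other projects that depend on each project directly."""
    dependents = reverse_edges(depends_on)
    denom = max(len(dependents) - 1, 1)
    return {p: len(users) / denom for p, users in dependents.items()}


def transitive_reach(depends_on):
    """Number of projects that depend on each project, directly or indirectly."""
    dependents = reverse_edges(depends_on)
    reach = {}
    for start in dependents:
        seen = set()
        queue = deque([start])
        # Breadth-first search along "is depended on by" edges.
        while queue:
            node = queue.popleft()
            for user in dependents[node]:
                if user != start and user not in seen:
                    seen.add(user)
                    queue.append(user)
        reach[start] = len(seen)
    return reach


if __name__ == "__main__":
    scores = in_degree_centrality(DEPENDS_ON)
    for name in sorted(scores, key=scores.get, reverse=True):
        print(name, round(scores[name], 2), transitive_reach(DEPENDS_ON)[name])
```

In this toy graph, `lib_core` ranks highest on both measures because every other project depends on it, either directly or through `lib_http`. This mirrors the kind of "dominant project" the study looks for, though the paper's actual metrics and data may differ.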