Patterns of folder use and project popularity: a case study of GitHub repositories

Jiaxin Zhu, Minghui Zhou, Audris Mockus
DOI: https://doi.org/10.1145/2652524.2652564
2014-01-01
Abstract: Context: Every software development project uses folders to organize software artifacts. Goal: We would like to understand how folders are used and what ramifications different uses may have. Method: In this paper we study the frequency of folders used by 140k GitHub projects and use regression analysis to model how folder use is related to project popularity, i.e., the extent of forking. Results: We find that the standard folders, such as document, testing, and examples, are not only among the most frequently used, but their presence in a project is associated with increased chances that a project's code will be forked (i.e., used by others) and an increased number of forks. Conclusions: This preliminary study of folder use suggests opportunities to quantify (and improve) file organization practices based on folder use patterns of large collections of repositories.