Identifying Features in Forks

Shurui Zhou, Stefan Stanciulescu, Olaf Lessenich, Yingfei Xiong, Andrzej Wasowski, Christian Kastner
DOI: https://doi.org/10.1145/3180155.3180205
2018-01-01
Abstract: Fork-based development has been widely used both in open source communities and in industry, because it gives developers flexibility to modify their own fork without affecting others. Unfortunately, this mechanism has downsides: when the number of forks becomes large, it is difficult for developers to get or maintain an overview of activities in the forks. Current tools provide little help. We introduce INFOX, an approach to automatically identify non-merged features in forks and to generate an overview of active forks in a project. The approach clusters cohesive code fragments using code and network-analysis techniques and uses information-retrieval techniques to label clusters with keywords. The clustering is effective, with 90% accuracy on a set of known features. In addition, a human-subject evaluation shows that INFOX can provide actionable insight for developers of forks.