Assessing Code Authorship: The Case of the Linux Kernel

Guilherme Avelino, Leonardo Passos, Andre Hora, Marco Tulio Valente
DOI: https://doi.org/10.1007/978-3-319-57735-7_15
2017-03-09
Abstract: Code authorship is key information in large-scale open-source systems. Among other uses, it allows maintainers to assess the division of work and identify key collaborators. Interestingly, open-source communities lack guidelines on how to manage authorship. This could be mitigated by building an empirical body of knowledge on how authorship-related measures evolve in successful open-source communities. Toward that goal, we perform a case study on the Linux kernel. Our results show that: (a) only a small portion of developers (26%) make significant contributions to the code base; (b) the distribution of the number of files per author is highly skewed, with a small group of top authors (3%) responsible for hundreds of files, while most authors (75%) are responsible for at most 11 files; (c) most authors (62%) have a specialist profile; (d) authors with a high number of co-authorship connections tend to collaborate with others who have fewer connections.
Software Engineering, Operating Systems, Social and Information Networks