Large-Scale-Exploit of GitHub Repository Metadata and Preventive Measures

David Knothe, Frederick Pietschmann
DOI: https://doi.org/10.48550/arXiv.1908.05354
2019-08-20
Abstract: When working with Git, a popular version-control system, email addresses are part of the metadata for each individual commit. When those commits are pushed to remote hosting services like GitHub, those email addresses become visible not only to fellow developers but also to malicious actors aiming to exploit them. As part of our research, we created a tool that leverages the publicly available GitHub API to collect user data. Analysis of this data not only gives access to millions of email addresses in very little time, but is also powerful and dense enough to create targeted phishing attacks, posing a great threat to all GitHub users and their private, potentially sensitive data. Even worse, existing countermeasures fail to effectively protect against such exploits. As a consequence and main conclusion of this paper, we suggest multiple preventive measures that should be implemented as soon as possible. We also consider it the duty of both companies like GitHub and well-informed software engineers to inform fellow developers about the risk of exposing private email addresses in publicly published Git commits.
Cryptography and Security
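
The abstract notes that every Git commit carries an email address in its metadata. As a minimal, defensive sketch (not the authors' tool, and not part of the paper), the following Python snippet shows how a developer might audit a local repository to see which author emails its history would expose once pushed. The function names `count_author_emails` and `audit_repository` are hypothetical names chosen for illustration; the only external command assumed is `git log` with the `%ae` (author email) format placeholder.

```python
import subprocess
from collections import Counter


def count_author_emails(log_output: str) -> Counter:
    """Count author emails in the text produced by `git log --format=%ae`.

    Emails are lower-cased so that differently-cased duplicates are merged;
    blank lines are ignored.
    """
    emails = (line.strip().lower() for line in log_output.splitlines())
    return Counter(email for email in emails if email)


def audit_repository(path: str = ".") -> Counter:
    """Run `git log` in the repository at `path` and count author emails.

    Hypothetical helper: it assumes `git` is installed and `path` is a
    Git repository; otherwise subprocess raises an error.
    """
    result = subprocess.run(
        ["git", "-C", path, "log", "--all", "--format=%ae"],
        capture_output=True,
        text=True,
        check=True,
    )
    return count_author_emails(result.stdout)
```

Calling `audit_repository(".").most_common()` inside a repository would list each author email with the number of commits that carry it, which is the same information any reader of the published history can see.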