Towards Using Source Code Repositories to Identify Software Supply Chain Attacks

Duc Ly Vu,Ivan Pashchenko,Fabio Massacci,Henrik Plate,Antonino Sabetta
DOI: https://doi.org/10.1145/3372297.3420015
2020-10-30
Abstract:Increasing popularity of third-party package repositories, like NPM, PyPI, or RubyGems, makes them an attractive target for software supply chain attacks. By injecting malicious code into legitimate packages, attackers were known to gain more than 100,000 downloads of compromised packages. Current approaches for identifying malicious payloads are resource demanding. Therefore, they might not be applicable for the on-the-fly detection of suspicious artifacts being uploaded to the package repository. In this respect, we propose to use source code repositories (e.g., those in Github) for detecting injections into the distributed artifacts of a package. Our preliminary evaluation demonstrates that the proposed approach captures known attacks when malicious code was injected into PyPI packages. The analysis of the 2666 software artifacts (from all versions of the top ten most downloaded Python packages in PyPI) suggests that the technique is suitable for lightweight analysis of real-world packages.
What problem does this paper attempt to address?