Plagiarism detection for multithreaded software based on thread-aware software birthmarks.

Zhenzhou Tian, Qinghua Zheng, Ting Liu, Ming Fan, Xiaodong Zhang, Zijiang Yang
DOI: https://doi.org/10.1145/2597008.2597143
2014-01-01
Abstract: The availability of inexpensive multicore hardware presents a turning point in software development. To benefit from the continued exponential throughput advances in new processors, software applications must be multithreaded. As multithreaded programs become increasingly popular, plagiarism of such programs has begun to plague the software industry. Although there has been tremendous progress on software plagiarism detection technology, existing dynamic approaches remain optimized for sequential programs and cannot be applied to multithreaded programs without significant redesign. This paper fills the gap by presenting two dynamic birthmark-based approaches. The first approach extracts key instructions, while the second extracts system calls. Both approaches account for the effect of thread scheduling when computing software birthmarks. We have implemented a prototype based on the Pin instrumentation framework. Our empirical study shows that the proposed approaches can effectively detect plagiarism of multithreaded programs and exhibit strong resilience to various semantics-preserving code obfuscations.
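To illustrate why thread scheduling matters for a dynamic birthmark, here is a minimal Python sketch. It is not the paper's algorithm: the trace format (a list of `(thread_id, event)` pairs), the k-gram birthmark, the multiset Jaccard similarity, and all function names are assumptions made purely for illustration. The idea it shows is that slicing a recorded trace by thread before extracting features makes the birthmark insensitive to how the scheduler interleaved the threads.

```python
from collections import Counter


def kgrams(events, k):
    """Return all contiguous length-k windows of an event sequence."""
    return [tuple(events[i:i + k]) for i in range(len(events) - k + 1)]


def naive_birthmark(trace, k=2):
    """Hypothetical baseline: k-grams over the global interleaved trace."""
    return Counter(kgrams([event for _, event in trace], k))


def thread_aware_birthmark(trace, k=2):
    """Hypothetical thread-aware variant: slice the trace by thread id,
    then pool the k-grams of each per-thread sequence."""
    per_thread = {}
    for tid, event in trace:
        per_thread.setdefault(tid, []).append(event)
    birthmark = Counter()
    for events in per_thread.values():
        birthmark.update(kgrams(events, k))
    return birthmark


def similarity(a, b):
    """Multiset Jaccard similarity between two birthmarks, in [0, 1]."""
    keys = set(a) | set(b)
    inter = sum(min(a[x], b[x]) for x in keys)
    union = sum(max(a[x], b[x]) for x in keys)
    return inter / union if union else 1.0


# Two runs of the same (made-up) program under different schedules.
run_a = [(1, "open"), (2, "socket"), (1, "read"),
         (2, "connect"), (1, "close"), (2, "send")]
run_b = [(2, "socket"), (2, "connect"), (1, "open"),
         (1, "read"), (2, "send"), (1, "close")]

naive_score = similarity(naive_birthmark(run_a), naive_birthmark(run_b))
aware_score = similarity(thread_aware_birthmark(run_a),
                         thread_aware_birthmark(run_b))
print(naive_score, aware_score)
```

In this toy example the two runs share no global 2-grams, so the naive score is 0.0, while the per-thread sequences are identical and the thread-aware score is 1.0. A real system would record traces with an instrumentation tool and would need to match threads across runs, which this sketch sidesteps by pooling the k-grams of all threads.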