Revisiting the Challenges and Opportunities in Software Plagiarism Detection

Xi Xu,Ming Fan,Ang Jia,Yin Wang,Zheng Yan,Qinghua Zheng,Ting Liu
DOI: https://doi.org/10.1109/saner48275.2020.9054847
2020-01-01
Abstract:Software plagiarism seriously impedes the healthy development of open source software. To fight against code obfuscation and inherent non-determinism of thread scheduling applied against software plagiarism detection, we proposed a new dynamic birthmark called DYnamic Key Instruction Sequence (DYKIS) and a framework called Thread-oblivious dynamic Birthmark (TOB) for the purpose of reviving the existing birthmarks and a thread-aware dynamic birthmark called Thread-related System call Birthmark (TreSB). Though many approaches have been proposed for software plagiarism detection, they are still limited to satisfy the following highly desired requirements: the applicability to handle binary, the capability to detect partial plagiarism, the resiliency to code obfuscation, the interpretability on detection results, and the scalability to process large-scale software. In this position paper, we discuss and outline the research opportunities and challenges in the field of software plagiarism detection in order to stimulate brilliant innovations and direct our future research efforts.
What problem does this paper attempt to address?