JSidentify: A Hybrid Framework for Detecting Plagiarism Among JavaScript Code in Online Mini Games
Qun Xia, Zhongzhu Zhou, Zhihao Li, Bin Xu, Wei Zou, Zishun Chen, Huafeng Ma, Gangqiang Liang, Haochuan Lu, Shiyu Guo, Ting Xiong, Yuetang Deng, Tao Xie
DOI: https://doi.org/10.1145/3377813.3381352
2020-01-01
Abstract: Online mini games are lightweight game apps, typically implemented in JavaScript (JS), that run inside a host mobile app (such as WeChat, Baidu, or Alipay). Mini games do not need to be downloaded or upgraded through an app store, so a single host app can aggregate the services of many apps. Hundreds of millions of users play tens of thousands of mini games; because these games are highly profitable, they are popular targets of plagiarism. Plagiarized code is often deeply obfuscated after being cloned from the original, and it frequently carries malicious code segments and copyright infringements, which poses great challenges for existing plagiarism detection tools. To address these challenges, we design and implement JSidentify, a hybrid framework for detecting plagiarism among online mini games. JSidentify includes three techniques that operate at different levels of code abstraction, and it applies them one by one in the order given by a constructed priority list to reduce overall detection time. Our evaluation results show that JSidentify outperforms existing state-of-the-art approaches, achieving the best precision and recall with affordable detection time, both when detecting plagiarism among online mini games and when detecting clones among general JS programs. Our deployment experience also shows that JSidentify is indispensable in the daily operations of online mini games in WeChat.
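
The priority-list idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the three techniques, so the detectors below (`exact_text_similarity`, `token_jaccard_similarity`, `shape_similarity`), the `detect` function, and the 0.9 threshold are all hypothetical stand-ins chosen for illustration, not JSidentify's actual components. The sketch only shows the general pattern: run detectors at different abstraction levels in priority order and stop at the first one that flags a match, so later detectors are skipped whenever an earlier one succeeds.

```python
import difflib
import re
from typing import Callable, List, Optional, Tuple

# A detector maps (original source, suspect source) to a similarity in [0, 1].
Detector = Callable[[str, str], float]

# Hypothetical, deliberately tiny keyword list; real JS has many more keywords.
JS_KEYWORDS = {"function", "return", "var", "let", "const", "if", "else", "for", "while"}


def exact_text_similarity(a: str, b: str) -> float:
    """Text level: 1.0 only if the sources match after removing whitespace."""
    def normalize(s: str) -> str:
        return "".join(s.split())
    return 1.0 if normalize(a) == normalize(b) else 0.0


def token_jaccard_similarity(a: str, b: str) -> float:
    """Token level: Jaccard overlap of the sets of identifier-like tokens."""
    def tokens(s: str) -> set:
        return set(re.findall(r"[A-Za-z_$][\w$]*", s))
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def shape_similarity(a: str, b: str) -> float:
    """Structure level (crude): compare token sequences after replacing
    non-keyword identifiers with ID and numbers with NUM, so that renaming
    variables does not change the result."""
    def shape(s: str) -> List[str]:
        out = []
        for tok in re.findall(r"[A-Za-z_$][\w$]*|\d+|\S", s):
            if tok in JS_KEYWORDS:
                out.append(tok)
            elif re.match(r"[A-Za-z_$]", tok):
                out.append("ID")
            elif tok.isdigit():
                out.append("NUM")
            else:
                out.append(tok)
        return out
    return difflib.SequenceMatcher(None, shape(a), shape(b)).ratio()


# Assumed ordering for this sketch: most literal check first, most abstract last.
PRIORITY_LIST: List[Tuple[str, Detector]] = [
    ("text", exact_text_similarity),
    ("token", token_jaccard_similarity),
    ("shape", shape_similarity),
]


def detect(original: str, suspect: str,
           priority_list: List[Tuple[str, Detector]] = PRIORITY_LIST,
           threshold: float = 0.9) -> Optional[str]:
    """Return the name of the first detector whose score reaches the
    threshold, or None if no detector flags the pair."""
    for name, detector in priority_list:
        if detector(original, suspect) >= threshold:
            return name  # stop early; remaining detectors are not run
    return None
```

In this sketch, a verbatim copy is caught by the first detector, while a copy with renamed identifiers falls through to the structure-level check. How JSidentify actually constructs and orders its priority list is described in the paper itself, not in the abstract.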