Document-structure-based copy detection algorithm

Bo JIN, Yan-jun SHI, Hong-fei TENG
DOI: https://doi.org/10.3321/j.issn:1000-8608.2007.01.024
2007-01-01
Abstract: Copy detection for academic papers is important for both intellectual property protection and the prevention of academic plagiarism. Current approaches rely mainly on digital fingerprinting and keyphrase matching. To address the difficulties of copy detection in Chinese text, a set of document-structure-based algorithms for identifying plagiarized Chinese papers is presented, together with their mathematical models. Full plagiarism, partial plagiarism, and pieced-together plagiarism are identified by combining document-structure analysis, fingerprinting, and word-frequency techniques. Finally, a comparison with two typical identification methods demonstrates that the proposed algorithm detects copied papers accurately.
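The abstract names fingerprinting and word-frequency techniques but gives no formulas, so the following is only a minimal sketch of how two such similarity scores are commonly computed, not the authors' algorithm. The function names, the shingle length k = 5, the use of MD5, and the whitespace tokenizer are all assumptions made for illustration; Chinese text in particular would need a proper word segmenter in place of `split()`.

```python
import hashlib
import math
from collections import Counter


def fingerprints(text: str, k: int = 5) -> set[int]:
    """Hash every k-character shingle of the text (whitespace removed).

    Character shingles need no word segmentation; k = 5 is an assumed value.
    """
    chars = "".join(text.split())
    if len(chars) < k:
        return set()
    return {
        int.from_bytes(
            hashlib.md5(chars[i:i + k].encode("utf-8")).digest()[:8], "big"
        )
        for i in range(len(chars) - k + 1)
    }


def fingerprint_overlap(a: str, b: str, k: int = 5) -> float:
    """Jaccard similarity of the two fingerprint sets (0.0 if either is empty)."""
    fa, fb = fingerprints(a, k), fingerprints(b, k)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)


def word_frequency_cosine(a: str, b: str) -> float:
    """Cosine similarity of word-count vectors.

    Assumes whitespace-separated tokens, which does not hold for raw Chinese.
    """
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0
```

A structure-aware detector in the spirit of the abstract might apply such scores section by section (for example, abstract against abstract) rather than to whole documents, so that a single copied section stands out even when the rest of the paper differs.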