Copy Detection in Chinese Documents Using Ferret

Jun Peng Bao, Caroline Lyon, Peter C. R. Lane
DOI: https://doi.org/10.1007/s10579-007-9020-1
2007-01-01
Language Resources and Evaluation
Abstract: The Ferret copy detector has been used since 2001 to find plagiarism in large collections of students’ coursework in English. This article reports on extending its application to Chinese, with experiments on corpora of coursework collected from two Chinese universities. Our experiments show that Ferret can find both artificially constructed plagiarism and actually occurring, previously undetected plagiarism. We discuss issues of representation, focus on the effectiveness of a sub-symbolic approach, and show that Ferret does not need to find word boundaries first.
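To make concrete the idea of comparing documents without first finding word boundaries, here is a minimal sketch. It is not taken from the paper. It assumes that documents are represented as sets of overlapping character trigrams and compared with a Jaccard-style resemblance score. The abstract does not specify the n-gram length or the similarity measure, and the function names `char_ngrams` and `resemblance` are hypothetical.

```python
def char_ngrams(text, n=3):
    """Return the set of overlapping character n-grams in text.

    Whitespace is dropped and no word segmentation is attempted, so the
    same function applies to Chinese and to alphabetic text.
    Assumption: n=3 is an illustrative default, not a value from the abstract.
    """
    chars = [c for c in text if not c.isspace()]
    return {"".join(chars[i:i + n]) for i in range(len(chars) - n + 1)}


def resemblance(doc_a, doc_b, n=3):
    """Jaccard-style score: shared n-grams divided by all distinct n-grams."""
    grams_a = char_ngrams(doc_a, n)
    grams_b = char_ngrams(doc_b, n)
    union = grams_a | grams_b
    if not union:
        # Both texts are shorter than n characters; treat as no evidence.
        return 0.0
    return len(grams_a & grams_b) / len(union)


if __name__ == "__main__":
    a = "我们今天去学校"
    b = "我们明天去学校"
    print(resemblance(a, b))
```

In this sketch a pair of documents scoring well above the typical pairwise score in a collection would be flagged for human inspection; any threshold would have to be chosen empirically for the corpus at hand.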