An Intelligent Approach for Semantic Plagiarism Detection in Scientific Papers

Ayoub Ali M. Saeed, A. Taqa
DOI: https://doi.org/10.1109/ICCITM56309.2022.10031641
2022-08-31
Abstract: Plagiarism is the attempt to steal someone else’s idea and present it as your own, which is both unlawful and unethical. Plagiarism detection is the automatic detection of written material that has been reused. Modern plagiarism detection software typically examines text documents using text representation and text similarity approaches to look for words that appear in both suspicious and source documents. The widespread use of the internet and easy access to textual content have increased the demand for automated plagiarism detection. This paper proposes an intelligent approach to detecting semantic plagiarism in scientific papers. A corpus storing the text of source scientific papers was created so that suspicious documents can be compared against it to detect plagiarism. The documents are clustered into groups using the Mini-Batch K-Means clustering algorithm, and the documents in a specified category are then retrieved. The Universal Sentence Encoder (USE) model was employed to extract textual features. The plagiarism percentage report is displayed as a web page and can be downloaded as a PDF file. A decision to accept or reject the suspicious scientific paper is made according to a threshold value. To evaluate the proposed system, it was compared with the Turnitin platform for plagiarism detection (PD); the comparison indicates that the proposed system is reliable. The findings also indicate that the reported plagiarism percentage can be reduced by deleting some unnecessary parts of the suspicious document.
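The compare-and-decide step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how sentence similarities are scored or aggregated, so cosine similarity, the per-sentence threshold, the acceptance threshold, and every name below (`toy_embed`, `plagiarism_report`, `sentence_threshold`, `accept_threshold`) are assumptions made for illustration. `toy_embed` is a bag-of-words stand-in so the example runs with the standard library only; it does not capture semantics the way USE embeddings would. The Mini-Batch K-Means step is omitted; in the described system it would narrow `sources` to one cluster before comparison.

```python
import math
import re
from collections import Counter


def toy_embed(sentence):
    """Hypothetical stand-in for a sentence encoder: a word-count vector.
    A real system would return a dense USE embedding here."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def plagiarism_report(suspicious, sources, sentence_threshold=0.8,
                      accept_threshold=30.0, embed=toy_embed):
    """Flag suspicious sentences whose best match in `sources` reaches
    `sentence_threshold`, then accept or reject on the flagged share.
    Both thresholds are illustrative values, not taken from the paper."""
    source_vecs = [embed(s) for s in sources]
    flagged = []
    for sent in suspicious:
        vec = embed(sent)
        best = max((cosine(vec, sv) for sv in source_vecs), default=0.0)
        if best >= sentence_threshold:
            flagged.append(sent)
    percentage = 100.0 * len(flagged) / len(suspicious) if suspicious else 0.0
    decision = "reject" if percentage > accept_threshold else "accept"
    return {"percentage": percentage, "flagged": flagged, "decision": decision}


sources = [
    "Plagiarism detection finds reused text.",
    "Clustering groups similar documents together.",
]
suspicious = [
    "Plagiarism detection finds reused text.",
    "We measured rainfall in four cities.",
]
report = plagiarism_report(suspicious, sources)
print(report["percentage"], report["decision"])
```

Passing the encoder in through the `embed` parameter keeps the scoring logic independent of the model, so a dense-vector encoder could be swapped in, provided `cosine` is adapted to that vector type.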
Computer Science