DOMAIN-SPECIFIC CROSS-LINGUAL URDU TO ENGLISH (CLUE) PLAGIARISM DETECTION

Authors

  • F. Shahzad
    Department of Computer Science, Pakistan Institute of Engineering and Technology, 60000 Multan
    *Department of Information Technology, Bahauddin Zakariya University, 60000 Multan
    **School of Engineering and Computer Science, Victoria University of Wellington, New Zealand

DOI:

https://doi.org/10.57041/pjs.v70i2.164

Keywords:

Cross-lingual Plagiarism, Plagiarism detection, Text similarity analysis, n-Grams, Translational tools

Abstract

Plagiarism is the act of copying someone’s text without attribution. Cross-lingual plagiarism detection (CLPD) deals with discovering and retrieving copied words and sentences in a bilingual scenario. There have been various attempts to detect cross-lingual plagiarism in settings such as English-to-German, English-to-Spanish, and Arabic-to-English. However, no system or framework is available for Urdu-English CLPD. This paper presents a new framework for detecting Urdu-English plagiarism using the Translation plus Monolingual Analysis technique. The framework uses the CLUE (Cross-Lingual Urdu-English) corpus, which contains documents from two domains: computer literature and a general area. The main aim of the paper is to detect cross-lingual plagiarism from Urdu to English. The translation quality of Google and Bing is analyzed for the translation of source documents. Empirical results show that the plagiarism ratio varies with the translation tool across the different plagiarism cases. Experiments show that the plagiarism ratio is highest in Near Copy documents, moderate in Light Revision documents, and lowest in Heavy Revision documents. This research is useful for understanding the scenarios in which particular translation tools are effective.
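
The monolingual analysis step described above can be illustrated with a small sketch. The abstract does not state which similarity measure the framework uses, so this example assumes a containment score over word trigrams (n-grams appear in the keywords). It also assumes the Urdu source has already been translated into English by an external tool; no translation API is called here. The names word_ngrams and plagiarism_ratio are hypothetical and are not taken from the paper.

```python
import re


def word_ngrams(text, n=3):
    """Return the set of lowercase word n-grams found in text."""
    tokens = re.findall(r"\w+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def plagiarism_ratio(translated_source, suspicious, n=3):
    """Containment score (an assumed measure, not the paper's):
    the share of the suspicious document's n-grams that also occur
    in the English translation of the Urdu source."""
    suspicious_grams = word_ngrams(suspicious, n)
    if not suspicious_grams:
        return 0.0
    source_grams = word_ngrams(translated_source, n)
    return len(suspicious_grams & source_grams) / len(suspicious_grams)


# Toy English text standing in for a machine-translated Urdu source.
translated_source = "the quick brown fox jumps over the lazy dog"

near_copy = "the quick brown fox jumps over the lazy dog"
light_revision = "the quick brown fox leaps over the lazy dog"
heavy_revision = "a fast fox leaped above a sleepy hound"

for label, doc in [("near copy", near_copy),
                   ("light revision", light_revision),
                   ("heavy revision", heavy_revision)]:
    print(label, round(plagiarism_ratio(translated_source, doc), 3))
```

On this toy input the score falls from near copy to light revision to heavy revision, which mirrors the ordering reported in the abstract. Real documents would need the translation step and a tuned choice of n.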


Published

2022-12-16

How to Cite

F. Shahzad. (2022). DOMAIN-SPECIFIC CROSS-LINGUAL URDU TO ENGLISH (CLUE) PLAGIARISM DETECTION. Pakistan Journal of Science, 70(2). https://doi.org/10.57041/pjs.v70i2.164
