Language Resources for Intrinsic Plagiarism Detection in Urdu Language
Published: 2 September 2022 | Version 1 | DOI: 10.17632/8fknny5s5p.1
Contributors:
Faraz Manzoor, Adnan Abid, Atif Alvi
Description
This dataset supports intrinsic plagiarism detection in Urdu. To produce a high-quality dataset for training a classification algorithm, we gathered Urdu essays and reports from popular, highly trending websites such as jang.com, urduessaypoint.blogspot.com, and www.dawnnews.tv. All documents gathered from these websites were then compiled in .txt format. More than 500 plagiarized and unplagiarized documents were created systematically.
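Since the documents are distributed as .txt files, a minimal loading sketch may be useful. It rests on assumptions not stated in the description: that the files are UTF-8 encoded and sit somewhere under a single root folder. The function name `load_documents` is hypothetical, and the folder layout of the published archive is not documented here.

```python
from pathlib import Path


def load_documents(root):
    """Read every .txt file under `root` into a {relative_path: text} dict.

    Assumption: the files are UTF-8 encoded (not confirmed by the dataset page).
    """
    root = Path(root)
    documents = {}
    for path in sorted(root.rglob("*.txt")):
        # "utf-8-sig" decodes UTF-8 and drops a leading byte-order mark if present.
        text = path.read_text(encoding="utf-8-sig")
        documents[path.relative_to(root).as_posix()] = text
    return documents
```

Calling `load_documents` on the folder where the archive was extracted returns a dictionary keyed by relative file path. How plagiarized and unplagiarized documents are told apart depends on the archive's own naming, which should be checked before assigning labels.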
Institutions
University of Management and Technology
Categories
Machine Learning, Plagiarism, Urdu Language, Deep Learning