Language Resources for Intrinsic Plagiarism Detection in Urdu Language

Name: Language Resources for Intrinsic Plagiarism Detection in Urdu Language
Creator: Faraz Manzoor
Published: 2022-09-02T20:50:37.419Z
Keywords: Machine Learning, Plagiarism, Urdu Language, Deep Learning

Manzoor, Faraz; Farooq, Muhammad Shoaib; Abid, Adnan; Alvi, Atif

doi:10.17632/8fknny5s5p.1

Published: 2 September 2022 | Version 1 | DOI: 10.17632/8fknny5s5p.1

Contributors:

Faraz Manzoor, Muhammad Shoaib Farooq, Adnan Abid, Atif Alvi

Description

This dataset supports intrinsic plagiarism detection in Urdu. To produce a high-quality dataset for training a classification algorithm, we gathered Urdu essays and reports from popular and trending websites such as jang.com, urduessaypoint.blogspot.com, and www.dawnnews.tv. All documents gathered from these websites were compiled in .txt format. More than 500 plagiarized and unplagiarized documents were created systematically.
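Since the documents are plain .txt files, a minimal loading sketch may help when preparing them for a classifier. The folder layout is an assumption, not something the dataset record states: the function name `load_documents` and the subfolder names `plagiarized` and `unplagiarized` are hypothetical and should be adjusted to the actual archive structure. UTF-8 encoding is also assumed.

```python
from pathlib import Path


def load_documents(root, label_dirs=("plagiarized", "unplagiarized")):
    """Read every .txt file under root/<label>/ and return (text, label) pairs.

    The subfolder names are hypothetical; change them to match the archive.
    """
    samples = []
    for label in label_dirs:
        folder = Path(root) / label
        if not folder.is_dir():
            # Skip labels whose folder is absent instead of raising.
            continue
        for path in sorted(folder.glob("*.txt")):
            # Urdu text is assumed to be UTF-8; undecodable bytes are replaced.
            text = path.read_text(encoding="utf-8", errors="replace")
            samples.append((text, label))
    return samples
```

Calling `load_documents("path/to/dataset")` would return a list of `(text, label)` tuples that can be passed to a tokenizer or feature extractor.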

Institutions

University of Management and Technology
