Results of cross-lingual text reuse detection among European Universities

Published: 18 March 2022 | Version 2 | DOI: 10.17632/t726dmtx24.2
Contributors: Georgy Gorbachev, Andrey Khazov, Vladislav Komarnitsky, Aleksandra Sakharova

Description

This dataset contains the results of the experiment described in the article "Cross-language plagiarism detection: a case study of European languages academic works". It includes the analyzed documents from OATD (only those whose license permits it), the source documents (only those from open sources), and the detection reports. Each source and analyzed document is named by the MD5 hash of its URL. Each report is in JSON format and contains the URLs of the documents, the offset and length of each detected case, and the first and last tokens of each detected case.
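The naming scheme and report contents described above can be sketched in Python. This is only an illustration under stated assumptions: the key names (`document_url`, `cases`, `source_url`, `offset`, `length`, `first_token`, `last_token`), the example URLs, and the UTF-8 encoding of URLs before hashing are all hypothetical and may not match the actual files in the dataset.

```python
import hashlib
import json


def url_to_name(url: str) -> str:
    """Return the hex MD5 digest of a URL.

    Assumption: the URL is hashed as UTF-8 bytes; the dataset does not state this.
    """
    return hashlib.md5(url.encode("utf-8")).hexdigest()


# Hypothetical report layout; the real key names and nesting may differ.
sample_report = json.loads("""
{
  "document_url": "https://example.org/thesis.pdf",
  "cases": [
    {
      "source_url": "https://example.org/source.html",
      "offset": 120,
      "length": 45,
      "first_token": "the",
      "last_token": "results"
    }
  ]
}
""")


def case_spans(report: dict):
    """Yield (source file name, start, end) for each detected case."""
    for case in report["cases"]:
        start = case["offset"]
        yield url_to_name(case["source_url"]), start, start + case["length"]


spans = list(case_spans(sample_report))
```

Under these assumptions, `url_to_name` gives the file name to look up for a document or source URL, and `case_spans` turns each case's offset and length into a start/end pair. Whether offsets count characters or bytes is not stated in the description and should be checked against the data.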

Categories

Plagiarism

Licence