Expert Annotated Citation Context Pairs into Semantic Similarity Classes

Published: 25 May 2021 | Version 1 | DOI: 10.17632/6mb5hmsnp2.1
Contributor:
Toluwase Asubiaro

Description

A total of 981 citation context pairs, manually collected from biomedical publications, were annotated by human experts into three semantic similarity classes: similar, somewhat similar, and not similar. The experts were trained to code the citation context pairs; during the training, they were shown how to classify citation contexts into the semantic similarity classes. One of the overarching instructions was to consider the similarity of concepts/keywords in the two citation contexts as the basis for judging semantic similarity. The experts assigned a pair to the “not similar” class if the two citation contexts did not share similar concepts or keywords, or if the meanings of most or all of the concepts or keywords were different, so that the pair differed in meaning. A pair was assigned to the “somewhat similar” class if the two citation contexts shared some similar concepts, or if their keywords/concepts shared some meanings. Lastly, a pair was classified as “similar” if all the concepts/keywords were similar and, therefore, the two citation contexts were similar in meaning. The two experts independently annotated all 981 sampled citation context pairs.
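The three-class rubric described above can be illustrated with a small sketch. This is not the annotation procedure itself: the experts judged meaning by reading the texts, whereas the toy function below only compares literal concept sets. The names `SimilarityClass` and `classify_pair` are hypothetical, the concept sets are assumed to have been extracted beforehand, and the "most or all concepts differ" condition is simplified to "no concept is shared".

```python
from enum import Enum


class SimilarityClass(Enum):
    """The three classes named in the description."""
    SIMILAR = "similar"
    SOMEWHAT_SIMILAR = "somewhat similar"
    NOT_SIMILAR = "not similar"


def classify_pair(concepts_a: set[str], concepts_b: set[str]) -> SimilarityClass:
    """Toy approximation of the rubric using exact, case-insensitive matching.

    Assumption: each citation context has already been reduced to a set of
    concepts/keywords. Human annotators also weighed meaning, which this
    function does not attempt to model.
    """
    a = {c.lower() for c in concepts_a}
    b = {c.lower() for c in concepts_b}
    shared = a & b

    if not shared:
        # No shared concepts: treated here as "not similar".
        return SimilarityClass.NOT_SIMILAR
    if a == b:
        # All concepts match: treated here as "similar".
        return SimilarityClass.SIMILAR
    # Some, but not all, concepts match.
    return SimilarityClass.SOMEWHAT_SIMILAR


if __name__ == "__main__":
    # Made-up concept sets for illustration only; not taken from the dataset.
    print(classify_pair({"insulin", "diabetes"}, {"diabetes", "obesity"}).value)
```

A set comparison keeps the three cases easy to see, but it will disagree with the expert labels wherever two contexts express the same idea with different wording.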

Files

Institutions

Western University

Categories

Data Science, Machine Learning, Knowledge Representation, Scholarly Communication, Bibliometrics, Biomedical Research

Licence