Published: 22 October 2021 | Version 2 | DOI: 10.17632/ck365nbpk9.2
Francesco Taglino


This dataset contains the results of experiments on a method for evaluating semantic similarity between concepts in a taxonomy. The method follows the information-theoretic approach and allows the senses of concepts in a given context to be taken into account. The relevance of each sense is calculated in terms of its semantic relatedness with the compared concepts. In a previous work [9], the adopted semantic relatedness method was the one described in [10], whereas in this work we adopted the one described in [11]. This change results in an improvement of the method.

The dataset is composed of two folders, which contain the results of the previous and the new experimentation, respectively. Each folder contains a set of files, each referring to one pair of the well-known Miller and Charles benchmark dataset [1] for assessing semantic similarity. For each pair of concepts, the same 28 pairs are all considered as possible different contexts.

We applied our proposal by extending 7 methods for computing semantic similarity in a taxonomy, selected from the literature. The methods considered in the experiment are referred to as R [2], W&P [3], L [4], J&C [5], P&S [6], A [7], and A&M [8].

REFERENCES
[1] Miller G.A., Charles W.G. 1991. Contextual correlates of semantic similarity. Language and Cognitive Processes 6(1).
[2] Resnik P. 1995. Using Information Content to Evaluate Semantic Similarity in a Taxonomy. Int. Joint Conf. on Artificial Intelligence, Montreal.
[3] Wu Z., Palmer M. 1994. Verb semantics and lexical selection. 32nd Annual Meeting of the Association for Computational Linguistics.
[4] Lin D. 1998. An Information-Theoretic Definition of Similarity. Int. Conf. on Machine Learning.
[5] Jiang J.J., Conrath D.W. 1997. Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy. Int. Conf. Research on Computational Linguistics.
[6] Pirrò G. 2009. A Semantic Similarity Metric Combining Features and Intrinsic Information Content. Data Knowl. Eng. 68(11).
[7] Adhikari A., Dutta B., Dutta A., Mondal D., Singh S. 2018. An intrinsic information content-based semantic similarity measure considering the disjoint common subsumers of concepts of an ontology. J. Assoc. Inf. Sci. Technol. 69(8).
[8] Adhikari A., Singh S., Mondal D., Dutta B., Dutta A. 2016. A Novel Information Theoretic Framework for Finding Semantic Similarity in WordNet. CoRR abs/1607.05422 (arXiv:1607.05422).
[9] Formica A., Taglino F. 2021. An Enriched Information-Theoretic Definition of Semantic Similarity in a Taxonomy. IEEE Access, vol. 9.
[10] Schuhmacher M., Ponzetto S.P. 2014. Knowledge-based Graph Document Modeling. 7th ACM International Conference on Web Search and Data Mining.
[11] El Vaigh C.B., Goasdoué F., Gravier G., Sébillot P. 2020. A Novel Path-Based Entity Relatedness Measure for Efficient Collective Entity Linking. ISWC 2020.

Finally, each file reports the Pearson and Spearman correlations of our proposal with respect to human judgment.
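As an illustration of the two correlation figures reported in each file, the following is a minimal sketch of how Pearson and Spearman correlations between computed similarity scores and human ratings can be obtained. It is not the authors' implementation: the function names (pearson, average_ranks, spearman) are hypothetical, only the Python standard library is assumed, and the scores in the usage lines are invented for illustration and are not taken from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    # Invented scores for four concept pairs (illustration only).
    human_ratings = [0.4, 1.2, 3.1, 3.9]
    computed_similarity = [0.05, 0.30, 0.62, 0.97]
    print("Pearson: ", pearson(human_ratings, computed_similarity))
    print("Spearman:", spearman(human_ratings, computed_similarity))
```

Pearson measures linear agreement between the two score lists, whereas Spearman only compares their rankings, so a method that orders the pairs as humans do but on a different scale can score higher on Spearman than on Pearson.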



Consiglio Nazionale delle Ricerche


Reasoning in Semantics