Published: 03-03-2021 | Version 4 | DOI: 10.17632/thtndvvp9s.4
Francesco Taglino,
Anna Formica


This dataset contains the results of an experiment on a method for evaluating semantic similarity between concepts in a taxonomy. The method follows the information-theoretic approach and takes into account the senses of concepts in a given context. The dataset is composed of 28 files. Each file refers to one pair of the well-known Miller and Charles benchmark dataset [1] for assessing semantic similarity. For each pair of concepts, all 28 pairs of the benchmark are considered in turn as possible contexts. We applied our proposal by extending 7 methods for computing semantic similarity in a taxonomy, selected from the literature. The methods considered in the experiment are referred to as R [2], W&P [3], L [4], J&C [5], P&S [6], A [7], and A&M [8]. Finally, each file reports the correlation of our proposal with human judgement.

REFERENCES
[1] Miller, G.A., Charles, W.G. Contextual correlates of semantic similarity. Language and Cognitive Processes 6(1), 1-28 (1991).
[2] Resnik, P. Using Information Content to Evaluate Semantic Similarity in a Taxonomy. In Proc. of the Int. Joint Conf. on Artificial Intelligence, Montreal, Quebec, Canada, August 20-25, Morgan Kaufmann, 448-453 (1995).
[3] Wu, Z., Palmer, M. Verb semantics and lexical selection. In Proc. of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, 133-138 (1994).
[4] Lin, D. An Information-Theoretic Definition of Similarity. In Proc. of the Int. Conf. on Machine Learning, Madison, Wisconsin, USA, Morgan Kaufmann, 296-304 (1998).
[5] Jiang, J.J., Conrath, D.W. Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy. In Proc. of Int. Conf. Research on Computational Linguistics (ROCLING X), Taiwan (1997).
[6] Pirrò, G. A Semantic Similarity Metric Combining Features and Intrinsic Information Content. Data Knowl. Eng. 68(11), 1289-1308 (2009).
[7] Adhikari, A., Dutta, B., Dutta, A., Mondal, D., Singh, S. An intrinsic information content-based semantic similarity measure considering the disjoint common subsumers of concepts of an ontology. J. Assoc. Inf. Sci. Technol. 69(8), 1023-1034 (2018).
[8] Adhikari, A., Singh, S., Mondal, D., Dutta, B., Dutta, A. A Novel Information Theoretic Framework for Finding Semantic Similarity in WordNet. CoRR, arXiv:1607.05422 (2016).
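
As a rough illustration of the kind of computation behind the files, the sketch below implements the standard Resnik [2] and Lin [4] similarity measures on a tiny taxonomy and correlates the Lin scores with human ratings using Pearson's coefficient. It is not the context-aware extension evaluated in this dataset. The taxonomy, the concept probabilities, the human ratings, and all function names are hypothetical and chosen only for the example, and the choice of Pearson correlation is an assumption.

```python
import math

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "vehicle": "entity",
    "car": "vehicle",
    "bicycle": "vehicle",
    "food": "entity",
    "fruit": "food",
}

# Hypothetical concept probabilities p(c); the root has probability 1.
PROB = {
    "entity": 1.0,
    "vehicle": 0.4,
    "car": 0.2,
    "bicycle": 0.1,
    "food": 0.5,
    "fruit": 0.25,
}


def ancestors(concept):
    """Return the concept followed by all of its ancestors up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def ic(concept):
    """Information content: IC(c) = -log p(c)."""
    return -math.log(PROB[concept])


def resnik(c1, c2):
    """Resnik similarity: IC of the most informative common subsumer."""
    common = set(ancestors(c1)) & set(ancestors(c2))
    return max(ic(c) for c in common)


def lin(c1, c2):
    """Lin similarity: 2 * IC(common subsumer) / (IC(c1) + IC(c2))."""
    denom = ic(c1) + ic(c2)
    return 2 * resnik(c1, c2) / denom if denom > 0 else 1.0


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical concept pairs and made-up human ratings (not benchmark values).
pairs = [("car", "bicycle"), ("car", "fruit"), ("fruit", "food")]
human = [3.0, 0.5, 3.5]

scores = [lin(a, b) for a, b in pairs]
correlation = pearson(scores, human)
print(scores, correlation)
```

In a real experiment the probabilities would be estimated from a corpus or derived intrinsically from the taxonomy, and the ratings would come from the Miller and Charles benchmark [1].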