HESML V1R3 Java software library of ontology-based semantic similarity measures and information content models

Published: 3 October 2017 | Version 3 | DOI: 10.17632/t87s78dg78.3
Contributors:
Juan J. Lastra-Díaz

Description

HESML V1R3 is the third release of the Half-Edge Semantic Measures Library (HESML) detailed in [1], a scalable and efficient Java software library of ontology-based semantic similarity measures and Information Content (IC) models based on WordNet. HESML V1R3 implements most of the WordNet-based similarity measures and IC models reported in the literature. It also provides an XML-based input file format for specifying reproducible experiments on WordNet-based similarity, with no software coding required.

The main features of HESML are as follows: (1) it is based on an efficient and linearly scalable representation for taxonomies called PosetHERep, introduced in [1]; (2) its performance scales linearly with the size of the taxonomy; and (3) it does not use any caching strategy for vertex sets.

HESML V1R3 introduces two minor novelties: the vertex ID type has been changed from Integer to Long in order to support a larger number of vertices, and the library includes five new similarity measures introduced by Hao et al. (2011), Liu et al. (2007), Pekar & Staab (2002) and Stojanovic et al. (2001).

The HESML library is freely distributed for any non-commercial purpose under a CC BY-NC-SA 4.0 license, subject to citing the main HESML paper [1] as an attribution requirement. On the other hand, the commercial use of the similarity measures introduced in [2], as well as part of the intrinsic IC models introduced in [3] and [4], is protected by a patent application [5]. In addition, any user of HESML must fulfill the other licensing terms described in [1], which relate to other resources distributed with the library, such as WordNet and a dataset of corpus-based IC models, among others.

References:

[1] Lastra-Díaz, J. J., García-Serrano, A., Batet, M., Fernández, M., & Chirigati, F. (2017). HESML: a scalable ontology-based semantic similarity measures library with a set of reproducible experiments and a replication dataset. Information Systems, 66, 97–118. http://dx.doi.org/10.1016/j.is.2017.02.002

[2] Lastra-Díaz, J. J., & García-Serrano, A. (2015). A novel family of IC-based similarity measures with a detailed experimental survey on WordNet. Engineering Applications of Artificial Intelligence Journal, 46, 140–153.

[3] Lastra-Díaz, J. J., & García-Serrano, A. (2015). A new family of information content models with an experimental survey on WordNet. Knowledge-Based Systems, 89, 509–526.

[4] Lastra-Díaz, J. J., & García-Serrano, A. (2016). A refinement of the well-founded Information Content models with a very detailed experimental survey on WordNet. Universidad Nacional de Educación a Distancia (UNED).

[5] Lastra Díaz, J. J., & García Serrano, A. (2016). System and method for the indexing and retrieval of semantically annotated data using an ontology-based information retrieval model. United States Patent and Trademark Office (USPTO) Application, US2016/0179945 A1.
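
For readers unfamiliar with the terms used in the description, the following is a minimal, self-contained Java sketch of what an intrinsic IC model and an IC-based similarity measure compute on a taxonomy. It does not use the HESML API or the PosetHERep representation: the class ToyIcSimilarity, its methods and the five-vertex taxonomy are hypothetical and exist only for illustration. As assumptions, the sketch uses the well-known intrinsic IC model of Seco et al. (2004) and the Resnik (1995) and Lin (1998) measures, and it keys vertices by Long IDs, mirroring the ID type mentioned above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy illustration only; this is NOT the HESML API. */
public class ToyIcSimilarity {
    // Maps each vertex ID to the IDs of its direct parents (hypernyms).
    private final Map<Long, List<Long>> parents = new HashMap<>();

    public void addVertex(long id, long... parentIds) {
        List<Long> list = new ArrayList<>();
        for (long p : parentIds) list.add(p);
        parents.put(id, list);
    }

    /** Returns the vertex itself plus all of its ancestors. */
    public Set<Long> ancestors(long id) {
        Set<Long> visited = new HashSet<>();
        collect(id, visited);
        return visited;
    }

    private void collect(long id, Set<Long> visited) {
        if (!visited.add(id)) return; // already visited
        for (long p : parents.getOrDefault(id, Collections.<Long>emptyList())) {
            collect(p, visited);
        }
    }

    /** Number of proper descendants (hyponyms); quadratic, fine for a toy. */
    public int hyponymCount(long id) {
        int count = 0;
        for (long v : parents.keySet()) {
            if (v != id && ancestors(v).contains(id)) count++;
        }
        return count;
    }

    /** Intrinsic IC (Seco et al.): 1 - log(hypo + 1) / log(N); needs N >= 2. */
    public double ic(long id) {
        return 1.0 - Math.log(hyponymCount(id) + 1.0) / Math.log(parents.size());
    }

    /** Resnik: IC of the most informative common ancestor. */
    public double resnik(long a, long b) {
        Set<Long> common = ancestors(a);
        common.retainAll(ancestors(b));
        double best = 0.0;
        for (long c : common) best = Math.max(best, ic(c));
        return best;
    }

    /** Lin: 2 * IC(MICA) / (IC(a) + IC(b)). */
    public double lin(long a, long b) {
        if (a == b) return 1.0;
        double denom = ic(a) + ic(b);
        return denom == 0.0 ? 0.0 : 2.0 * resnik(a, b) / denom;
    }

    /** Hypothetical taxonomy: 1=entity, 2=animal, 3=plant, 4=dog, 5=cat. */
    public static ToyIcSimilarity buildToyTaxonomy() {
        ToyIcSimilarity t = new ToyIcSimilarity();
        t.addVertex(1);
        t.addVertex(2, 1);
        t.addVertex(3, 1);
        t.addVertex(4, 2);
        t.addVertex(5, 2);
        return t;
    }

    public static void main(String[] args) {
        ToyIcSimilarity t = buildToyTaxonomy();
        System.out.println("lin(dog, cat)   = " + t.lin(4, 5));
        System.out.println("lin(dog, plant) = " + t.lin(4, 3));
    }
}
```

In this toy taxonomy, dog and cat share the ancestor animal, whose IC is positive, so their Lin similarity is higher than that of dog and plant, whose only common ancestor is the root with an IC of zero. A library such as HESML computes the same kind of quantities over the full WordNet taxonomy, where the efficiency of the underlying representation matters.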

Files

Steps to reproduce

HESML V1R3 is distributed as a Java class library (HESML-V1R3.jar) plus a test driver application (HESMLclient.jar). Both have been developed using NetBeans 8.0.2 for Windows, although they have also been compiled and evaluated on Linux-based platforms using the corresponding NetBeans versions. To follow HESML development, we refer the reader to the permanent HESML GitHub repository at https://github.com/jjlastra/HESML.git

To compile HESML V1R3, follow these steps:

(1) Download the ZIP file above containing the full distribution of HESML V1R3.

(2) Install Java 8, Java SE Dev Kit 8 and NetBeans 8.0.2 or higher on your workstation.

(3) Launch the NetBeans IDE and open the HESML and HESMLclient projects contained in the root folder. NetBeans automatically detects the presence of an nbproject subfolder with the project files.

(4) Select the HESML and HESMLclient projects in turn in the project treeview, and invoke the "Clean and Build project (Shift + F11)" command on each in order to compile both projects.

To remain up to date on new HESML versions, or to ask for technical support, we invite readers to subscribe to the HESML forum by sending an email to the following address: hesml+subscribe@googlegroups.com

For more information, we refer the reader to the paper [1] above.

Institutions

Universidad Nacional de Educación a Distancia

Categories

Similarity Measure, Ontological Models

Licence