A survey of current practices in data search services

Published: 14-05-2018 | Version 1 | DOI: 10.17632/7j43z6n22z.1
SiriJodha Khalsa,
Peter Cotroneo,
Mingfang Wu


Relevancy ranking is an important component of making a data repository's search system responsive to data seekers’ needs. The Research Data Alliance (RDA) Data Discovery Paradigms Interest Group (https://www.rd-alliance.org/groups/data-discovery-paradigms-ig) is a collaborative activity within our data community that aims to improve data searchability. This survey is intended to gather information about the current practices of, and lessons learnt by, data repositories in implementing relevancy ranking in their search systems.

We expect that analysis of the survey results will:
* Help data repositories choose appropriate technologies when implementing or improving their search functionality;
* Provide a means for sharing experiences in improving relevancy ranking;
* Capture the aspirations, successes and challenges reported by research data repository managers;
* Help the Data Discovery Paradigms Interest Group align future activities on data search improvement with the interests of data search service providers.

For this purpose, we designed a survey instrument to address the following topics (the numbers in brackets indicate the number of questions asked per topic):
* What are the characteristics of each repository? (5)
* What are the system configurations (e.g., ranking model, index methods, query methods)? (7)
* Evaluation methods and benchmarks (10)
** What has been evaluated?
** What evaluation methods have been applied?
** How was the evaluation collection built?
** What is the approximate performance range of search systems with a certain configuration?
* What methods have been used to boost searchability by web search engines (e.g., Google, Bing)? (2)
* What other technologies or system configurations have been employed? (5)
* Wish list for future activities of the RDA relevance task force (2)

This collection consists of the survey instrument, the survey responses and the survey report.