TabbyXL: Dataset for the Performance Evaluation of a Software Platform for Rule-Based Spreadsheet Data Extraction and Transformation

Published: 16 December 2019 | Version 6 | DOI: 10.17632/ydcr7mcrtp.6
Contributor:
Alexey Shigarov

Description

This dataset is designed to evaluate TabbyXL (version 1.1.0), a software platform for the rule-based transformation of spreadsheet data from arbitrary tables to relational ones, which is freely available on [GitHub](https://github.com/tabbydoc/tabbyxl/releases/tag/v1.1.0). The dataset provides all the data required to reproduce the performance evaluation, including running the program and automatically evaluating TabbyXL's performance. The performance evaluation confirms that the implemented rulesets are applicable to a variety of arbitrary tables of the same genre (government statistical websites). This demonstrates that TabbyXL can be used to develop programs that transform spreadsheet data into relational form. The README.md file included in this dataset provides a detailed description of the data and the steps to reproduce the experiment.

Steps to reproduce

All steps to reproduce the experiment are presented in the README.md file included in the dataset.

Institutions

Matrosov Institute for System Dynamics and Control Theory of the Siberian Branch of the Russian Academy of Sciences

Categories

Spreadsheet, Document Analysis, Data Integration, Information Extraction, Database

Licence