Raw Data for Algorithmic Justice Indicators in Tourism Marketing (realistic version, 5,155 rows)
Description
This dataset contains the standardized raw outputs used to compute the three analytical indicators, Visibility (V), Epistemic Diversity Ratio (EDR), and Algorithmic Legibility Score (ALS), presented in Table 3 of the article “Algorithmic Justice in Tourism Marketing.” It includes 5,155 rows of harmonized data from five systems (ChatGPT, Gemini, Booking.com, TripAdvisor, Google Travel), with each row representing a ranked tourism-related entity and its corresponding reasoning flags. The file enables full replication of the indicator computation pipeline implemented in compute_indicators.py. The script is available in: Gonzalez Barbado, M. D. (2025). Replication Data for "Algorithmic Justice in Tourism Marketing" (Version 1) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.17443501
Files
Steps to reproduce
The replication of this study’s results follows a deterministic four-step pipeline. All files, scripts, and datasets are openly accessible via the Zenodo and Mendeley repositories.

1. Download Datasets
Retrieve both open datasets:
(a) Replication Data for Algorithmic Justice in Tourism Marketing, from Zenodo (https://doi.org/10.5281/zenodo.17443501).
(b) Raw Data for Algorithmic Justice Indicators in Tourism Marketing (realistic version, 5,155 rows), from Mendeley Data (DOI pending assignment).
Place both files in the same working directory.

2. Install Dependencies
Run the following command in a Python 3.11 environment:
    pip install pandas numpy scikit-learn matplotlib seaborn
Note: Optionally use a virtual environment or a reproducible container (a requirements.txt file is provided in the Zenodo package).

3. Execute Computation Script
Launch the indicator calculation script:
    python compute_indicators.py --input raw_data_realistic.csv --output indicator_results.csv
The script computes the three analytical indicators, Visibility (V), Epistemic Diversity Ratio (EDR), and Algorithmic Legibility Score (ALS), following the formulations in Section 3 of the paper. It sets a fixed random seed (numpy.random.seed(42)) so that repeated runs give identical results.

4. Generate Figures and Tables
The resulting indicator_results.csv file is used to create Figure 1 and Table 1. Visualization scripts are provided in analysis_script.py. Running the command:
    python analysis_script.py
reproduces all charts, including the comparative indicator plot (indicator_trends.pdf).
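Before running step 3, it can help to confirm that the downloaded file has the advertised size. The following is a minimal sketch, not part of the replication package: the helper name count_data_rows is hypothetical, and it assumes that raw_data_realistic.csv is UTF-8 encoded and begins with a single header row, neither of which is stated in this record.

```python
import csv
from pathlib import Path

# Row count stated in the dataset title.
EXPECTED_ROWS = 5155


def count_data_rows(csv_path):
    """Count non-empty records in a CSV file, skipping an assumed header row."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # assumption: the first record is a header
        return sum(1 for row in reader if row)


if __name__ == "__main__":
    path = Path("raw_data_realistic.csv")  # file name taken from step 3
    if path.exists():
        n = count_data_rows(path)
        status = "OK" if n == EXPECTED_ROWS else "MISMATCH"
        print(f"{path}: {n} data rows (expected {EXPECTED_ROWS}) -> {status}")
    else:
        print(f"{path} not found; place it in the working directory first.")
```

A mismatch would most likely indicate a truncated download or a file without a header row, in which case the header assumption above should be revisited.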
Note: All computations were performed on macOS 14.6 with Python 3.11.6 and NumPy 1.26. The workflow is platform-independent and has been validated on Linux (Ubuntu 22.04) and Windows 11. Metadata files in both repositories include data dictionaries, version hashes, and provenance information for verification.
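The version hashes mentioned in the note can be compared against locally computed digests. The sketch below is illustrative only: this record does not say which hash algorithm the metadata files use, so SHA-256 is an assumption, and the helper name sha256_of_file is hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()  # assumption: the published hashes are SHA-256
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # File names taken from the reproduction steps in this record.
    for name in ("raw_data_realistic.csv", "compute_indicators.py"):
        path = Path(name)
        if path.exists():
            print(f"{name}: {sha256_of_file(path)}")
        else:
            print(f"{name}: not found in the working directory")
```

If the metadata lists a different algorithm (for example MD5), hashlib.sha256 can be swapped for the corresponding hashlib constructor.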
Institutions
- Universitat Oberta de Catalunya, Catalunya, Barcelona