New York Taxi Pickup Profile

Published: 11 June 2021 | Version 1 | DOI: 10.17632/m49ryv4yt2.1
Contributor:
James Zhang

Description

This dataset contains a simplified profile of taxi pickups in New York City. The package contains three folders: data, py_code and r_code. The data folder contains the arrival profile data. The py_code folder contains the Python code for calculating dynamic time warping (DTW) distances. The r_code folder contains the R code for hierarchical clustering and silhouette index calculation.
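
As an illustration of the dynamic time warping distance mentioned above, the following is a minimal pure-Python sketch. It is not the code shipped in py_code: the function name dtw_distance and the absolute-difference cost are assumptions made for this example, and the package's own implementation may use a different cost function or window constraint.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Illustrative sketch only: uses |x - y| as the local cost and no
    warping window, which may differ from the dataset's own code.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Two profiles with the same shape but different timing, such as [1, 2, 3] and [1, 2, 2, 3], have a DTW distance of zero under this cost, which is why the measure suits arrival profiles whose peaks are shifted in time.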

Steps to reproduce

Description of the software code and its usage

1. Description

The package contains three folders: data, py_code and r_code. The data folder contains the arrival profile data. The py_code folder contains the Python code for calculating dynamic time warping (DTW) distances. The r_code folder contains the R code for hierarchical clustering and silhouette index calculation.

2. Run Python Code

To run the code, first install Python (> 3.6), preferably from Anaconda, and add the path of the Python installation to the system path. Then unzip the zip file to a local path. Open a command console and cd to the local folder where the zip file was unzipped. Change directory to "py_code" and run the following command to install the required libraries:

    pip install -r requirements.txt

Next, add the current directory to PYTHONPATH (Windows command prompt syntax):

    set PYTHONPATH=%PYTHONPATH%;.

Then run:

    python -m input_model_paper.taxi.dtw_calculation

This generates the dynamic time warping distance file "dtw_taxi_1.csv" in the data/taxi/processed folder.

3. Run R Code

First install R if it is not already installed. Then find the path to Rscript.exe and copy it. In the directory where the files were unzipped (move one level up from the py_code directory if required), run the following command, replacing the Rscript path with the one on your system:

    "C:\Program Files\R\R-3.6.1\bin\Rscript" .\r_code\input_model_paper\taxi\cluster_taxi.R

This creates the chart that visualises the DTW distances and the hierarchical clustering results. To calculate the silhouette index and compare the silhouette widths, run the following command:

    "C:\Program Files\R\R-3.6.1\bin\Rscript" .\r_code\input_model_paper\taxi\silhouette.R
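
To make the silhouette step more concrete, here is a minimal Python sketch of how silhouette widths can be computed from a precomputed distance matrix (such as a DTW distance matrix) and a list of cluster labels. It is not a translation of silhouette.R: the function name silhouette_widths, the input layout and the convention of assigning 0 to singleton clusters are assumptions made for this example.

```python
def silhouette_widths(dist, labels):
    """Silhouette width of each item, given a square distance matrix.

    Illustrative sketch only. dist[i][j] is the distance between items
    i and j; labels[i] is the cluster of item i. At least two clusters
    are required. Items in singleton clusters get a width of 0.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    widths = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)
            continue
        # a: mean distance to the other members of the item's own cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths
```

The average of the returned widths gives an overall silhouette index for one clustering, so comparing that average across different numbers of clusters is one way to choose where to cut a hierarchical clustering tree.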

Institutions

Deakin University - Geelong Campus at Waurn Ponds

Categories

Data Modelling, Computer Simulation

Licence