Pygotham 2017

Published: 6 October 2017 | Version 1 | DOI: 10.17632/8kyckg3dh5.1
Contributors:
Jessica Cox,
Corey Harper

Description

This dataset contains 4 files: 1. A .csv file containing 29,105 sentences from CC-BY papers that contain citations ("pygothamCleanDataset.csv"). 2. A Databricks Community Edition notebook to process and explore the data, in .dbc format. 3. A Databricks Community Edition notebook in HTML format for viewing. 4. The Pygotham slides in PDF format.

Files

Steps to reproduce

Before running the notebook, make sure to update all file paths to match your own environment. An archived copy of the notebook with all output is available at: https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/2644196477475309/2247597868200546/3108286398802724/latest.html
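As a quick check that a path has been updated correctly, the minimal sketch below counts the rows in the downloaded .csv using only the Python standard library. It is not taken from the notebook: the `DATA_PATH` value, the `count_rows` helper, and the assumption that the file starts with a single header row are all illustrative and may need adjusting.

```python
import csv
from pathlib import Path

# Hypothetical location: replace with wherever you saved the file.
DATA_PATH = Path("pygothamCleanDataset.csv")


def count_rows(path, has_header=True):
    """Count data rows in a CSV file, optionally skipping one header row.

    The has_header default is an assumption about the file layout.
    """
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        if has_header:
            next(reader, None)  # skip the assumed header row
        return sum(1 for _ in reader)


if __name__ == "__main__":
    if DATA_PATH.exists():
        print(count_rows(DATA_PATH))
    else:
        print(f"File not found, update DATA_PATH: {DATA_PATH}")
```

If the path is right, the printed count can be compared against the 29,105 sentences stated in the description. A difference of one would suggest the header assumption is wrong.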

Categories

Data Analysis

License