Coding of Data Cleaning and Processing

Published: 8 February 2019 | Version 1 | DOI: 10.17632/xd74wbskxg.1
Lingyu Meng


The data cleaning and processing were implemented mainly in a MySQL database. Invalid data were removed in two steps. First, incomplete and erroneous records were removed because they could introduce errors into the results. Second, trips over 60 minutes and 14 kilometers were removed because such trips were considered personal use, such as shopping, rather than commuting. The final dataset comprised 437,053 trip records from public bike systems and 7,051 trip records from bike sharing systems.
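
The two removal steps described above can be illustrated with a minimal Python sketch. It is not the MySQL code in this dataset. The `Trip` fields, the helper names, and the sample records are hypothetical. The sketch also assumes that "incomplete and error data" means missing or non-positive values, and that a trip is dropped when it exceeds either threshold (the description does not say whether both must be exceeded).

```python
from dataclasses import dataclass
from typing import List, Optional

# Thresholds taken from the description above.
MAX_DURATION_MIN = 60.0
MAX_DISTANCE_KM = 14.0


@dataclass
class Trip:
    # Hypothetical record layout; the real table columns may differ.
    trip_id: int
    duration_min: Optional[float]
    distance_km: Optional[float]


def is_complete_and_plausible(trip: Trip) -> bool:
    """Step 1 (assumed rule): reject missing or non-positive values."""
    if trip.duration_min is None or trip.distance_km is None:
        return False
    return trip.duration_min > 0 and trip.distance_km > 0


def is_within_commute_limits(trip: Trip) -> bool:
    """Step 2 (assumed rule): keep trips within both thresholds."""
    return (
        trip.duration_min <= MAX_DURATION_MIN
        and trip.distance_km <= MAX_DISTANCE_KM
    )


def clean_trips(trips: List[Trip]) -> List[Trip]:
    """Apply step 1, then step 2, and return the remaining trips."""
    return [
        t for t in trips
        if is_complete_and_plausible(t) and is_within_commute_limits(t)
    ]


# Made-up sample records, for illustration only.
sample_trips = [
    Trip(1, 12.5, 2.8),    # kept
    Trip(2, None, 3.1),    # removed: incomplete
    Trip(3, -4.0, 1.0),    # removed: erroneous duration
    Trip(4, 75.0, 6.0),    # removed: over 60 minutes
    Trip(5, 40.0, 15.2),   # removed: over 14 kilometers
    Trip(6, 60.0, 14.0),   # kept: exactly at both limits
]

cleaned = clean_trips(sample_trips)
kept_ids = [t.trip_id for t in cleaned]
print(kept_ids)
```

If the intended rule removes only trips that exceed both limits at once, the `and` in `is_within_commute_limits` would become `or`.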



Transport, Smartcards, Big Data