Data for: A scalable modeling framework for massive machine learning-based land change simulations: Applying the k-means clustering scheme and the Spark cluster computing environment for model calibration

Published: 30-10-2018 | Version 1 | DOI: 10.17632/s8zjfsy9gw.1
Hichem Omrani,
Marco Helbich,
Bryan Pijanowski,
Benoit Parmentier


Three land use datasets from the USA (Muskegon, Boston, and Wisconsin). For each dataset, we simulated the difference between urban gain and non-urban persistence between two time points. Cells that were already urban at the initial time were excluded, because such cells can show neither urban gain nor non-urban persistence. In addition, a set of variables serving as driving factors was defined for each cell. The inputs are six variables in 1978 for Muskegon, eight variables in 1998 for Boston, and sixteen variables in 1990 for Wisconsin; the outputs are urban change maps between two time points (1978-1998 in Muskegon, 1971-1999 in Boston, and 1990-2000 in Wisconsin). The land use cells have a spatial resolution of 100, 2, and 30 meters in Muskegon, Boston, and Wisconsin, respectively. These datasets could be used, for instance, to perform cross-model comparisons, among many other purposes.
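
The labeling rule described above (urban gain versus non-urban persistence, with initially urban cells excluded) can be illustrated with a minimal sketch. It is not the authors' code: the function name `label_change`, the encoding (1 = urban, 0 = non-urban), and the toy grids are assumptions made for illustration only, and the actual file layout of the datasets may differ.

```python
# Assumed encoding for illustration only: 1 = urban, 0 = non-urban.
URBAN = 1


def label_change(initial, final):
    """Label each cell from two land use grids of equal shape.

    Returns a grid where
      1    = urban gain (non-urban at the initial time, urban at the final time),
      0    = non-urban persistence (non-urban at both times),
      None = excluded (already urban at the initial time).
    """
    labels = []
    for row_t1, row_t2 in zip(initial, final):
        out = []
        for c1, c2 in zip(row_t1, row_t2):
            if c1 == URBAN:
                # Initially urban cells cannot gain urban land or persist as non-urban.
                out.append(None)
            else:
                out.append(1 if c2 == URBAN else 0)
        labels.append(out)
    return labels


# Toy 2 x 3 grids (hypothetical values, not taken from the datasets).
initial = [[0, 0, 1],
           [0, 1, 0]]
final = [[0, 1, 1],
         [1, 1, 0]]

labels = label_change(initial, final)
```

Under these assumptions, the cells labeled 0 or 1 would form the output variable for model calibration, and the cells labeled `None` would be dropped before pairing with the driving-factor variables.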