Data for: Application of deep learning models to provide a generalizable approach for cloud, shadow and land cover classification in PlanetScope and Sentinel-2 imagery

Published: 17 September 2019 | Version 1 | DOI: 10.17632/6gdybpjnwh.1
Iurii Shendryk


The dataset consists of PlanetScope- and Sentinel-2-derived image scenes (i.e. chips) collected over the Wet Tropics of Australia between December 1, 2016 and November 1, 2017. All PlanetScope imagery contains four bands: red, green, blue (RGB) and near-infrared (NIR), and has a ground sample distance (GSD) of 3.125 m. Sentinel-2 imagery was trimmed to the same RGB and NIR bands and resampled to 3.125 m resolution to match the PlanetScope imagery. Here we refer to the Wet Tropics PlanetScope- and Sentinel-2-derived data as datasets T-PS and T-S2, respectively. Both datasets were generated by splitting the satellite imagery with a grid of 128×128 pixels (i.e. 400×400 m at 3.125 m GSD) into image scenes and extracting a random sample of 2.5% and 0.5% of scenes per time step, respectively. This resulted in 4,943 image scenes in T-PS and 4,993 in T-S2. Each scene was manually labeled using 12 labels split into three groups:
1) Cloud labels (‘clear’, ‘partly cloudy’, ‘cloudy’, ‘haze’).
2) Shade labels (‘unshaded’, ‘partly shaded’, ‘shaded’), indicating the level of shade cast by cloud cover on a scene.
3) Land cover labels (‘forest’, ‘bare ground’, ‘water’, ‘agriculture’, ‘habitation’).
Labels within the cloud and shade groups are mutually exclusive, that is, each image scene received exactly one cloud label and exactly one shade label.
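The chipping and sampling procedure described above can be sketched as follows. This is a minimal illustration only, assuming the imagery is already loaded as a NumPy array of shape (bands, height, width); the function names, the synthetic input, and the fixed random seed are not part of the dataset's actual processing code.

```python
import numpy as np

CHIP = 128               # chip size in pixels (400 m at 3.125 m GSD)
SAMPLE_FRACTION = 0.025  # e.g. 2.5% of scenes per time step for T-PS

def chip_image(image, chip=CHIP):
    """Split a (bands, H, W) array into non-overlapping chip×chip scenes,
    returning an array of shape (n_scenes, bands, chip, chip)."""
    bands, h, w = image.shape
    rows, cols = h // chip, w // chip
    scenes = []
    for r in range(rows):
        for c in range(cols):
            scenes.append(image[:, r * chip:(r + 1) * chip,
                                   c * chip:(c + 1) * chip])
    return np.stack(scenes)

def sample_scenes(scenes, fraction, rng=None):
    """Draw a random sample of the given fraction of scenes (at least one)."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = max(1, int(round(len(scenes) * fraction)))
    idx = rng.choice(len(scenes), size=n, replace=False)
    return scenes[idx]

# Example: a synthetic 4-band (RGB + NIR) image of 1280×1280 pixels
img = np.zeros((4, 1280, 1280), dtype=np.uint16)
scenes = chip_image(img)                         # 10×10 = 100 scenes
sample = sample_scenes(scenes, SAMPLE_FRACTION)
print(scenes.shape)  # (100, 4, 128, 128)
```

In the real pipeline the sampling would be repeated per acquisition date (time step), with the sampled chips then written out for manual labeling.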



Remote Sensing, Image Classification, Convolutional Neural Network, Deep Learning, Satellite Image