Published: 5 July 2022 | Version 2 | DOI: 10.17632/4pw8vfsnpx.2
Bruno Juncklaus Martins, Maurici Monteiro


Clouds-1000 is a dataset of 1000 sky images captured by cameras pointed towards the horizon, facing north and south, from a location with a clear view of the sky at the UFSC Photovoltaic Laboratory of the Federal University of Santa Catarina, in Florianópolis/SC, Brazil. The images were collected every minute from March to October 2021. Each image was annotated by hand with the polygon tool in Supervisely and classified into 4 cloud types (Cirriform, Cumuliform, Stratiform, and Stratocumuliform) plus 1 class representing trees and buildings. This classification is based on the solar radiation absorption characteristics of clouds. Supervisely is a tool for image annotation and data management in which annotations are created through a graphical interface similar to that of other image editors.

For the annotation task, our research team was divided into 3 data analysts responsible for analyzing and labeling the images and 2 meteorologists responsible for supervision and validation. Sylvio Mantelli, who holds a PhD from INPE and works on our research team, helped with mentoring on data labeling and with several analyses throughout the development of the dataset. Maurici Amantino Monteiro is a professor and currently a climatologist at Aqueris Engenharia e Soluções Ambientais. With several years of experience in synoptic observation, he helped us understand the nature of clouds and associate them with their corresponding classes.

Due to the humid climate of the region, Cumulonimbus (Cb) clouds seldom form. This type of cloud usually forms in drier regions, so it does not occur in the dataset.
The dataset went through several rounds of validation. During one inspection we found 4 images that were either partially annotated or missing annotations entirely. Therefore, the latest and current version of the Clouds-1000 dataset consists of 996 fully hand-annotated images. For more information, see the GitHub repository.
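As an illustration of how the polygon annotations might be summarized, the following is a minimal Python sketch that estimates the fraction of an image covered by each class. It assumes the annotations are exported in the usual Supervisely JSON layout (an "objects" list whose entries carry "classTitle", "geometryType", and "points" with an "exterior" ring); the file layout of the published dataset is not confirmed here, and the class titles and coordinates in SAMPLE are hypothetical. The sketch ignores interior holes and overlapping polygons, so the fractions are only approximate.

```python
from collections import Counter

# Hypothetical annotation in the assumed Supervisely JSON layout.
# Class titles and coordinates are made up for illustration only.
SAMPLE = {
    "size": {"height": 1944, "width": 2592},
    "objects": [
        {
            "classTitle": "Cumuliform",
            "geometryType": "polygon",
            "points": {
                "exterior": [[0, 0], [100, 0], [100, 50], [0, 50]],
                "interior": [],
            },
        },
        {
            "classTitle": "Tree",
            "geometryType": "polygon",
            "points": {
                "exterior": [[0, 0], [10, 0], [0, 10]],
                "interior": [],
            },
        },
    ],
}


def polygon_area(points):
    """Return the area of a simple polygon using the shoelace formula."""
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def class_area_fractions(annotation):
    """Map each class title to its summed polygon area divided by image area.

    Interior holes and overlaps between polygons are not accounted for.
    """
    image_area = annotation["size"]["height"] * annotation["size"]["width"]
    areas = Counter()
    for obj in annotation.get("objects", []):
        if obj.get("geometryType") != "polygon":
            continue  # skip any non-polygon geometry
        areas[obj["classTitle"]] += polygon_area(obj["points"]["exterior"])
    return {title: area / image_area for title, area in areas.items()}


if __name__ == "__main__":
    for title, fraction in class_area_fractions(SAMPLE).items():
        print(f"{title}: {fraction:.6f}")
```

For a real file, the dictionary would be loaded with json.load from the annotation that accompanies each image instead of being defined inline.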


Steps to reproduce

Nimbus Gazer uses motionEye version 0.41 and Motion version 4.2.2. The system clock is set to the GMT time zone, and all camera LEDs are disabled in the boot configuration file so that they do not blink. The system starts capturing images at 08:00 GMT and stops at 22:00 GMT. Since our research lab is located in the GMT-3 time zone, this corresponds to 05:00 to 19:00 local time, an interval chosen to capture only images with at least some level of sunlight. Installing the OS requires at least 32 GB of free memory. Motion is configured at the lowest available frame rate, 1 frame per minute, to match the time resolution of the sensor data from our lab, so one image is captured every minute. Images are captured at a resolution of 2592 x 1944 and stored in a local directory (by default /Nuvens/camtest/) before being uploaded to the cloud. The upload uses the built-in option for uploading to a Google Drive directory. For more details, see the related link below.
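The capture schedule described above can be checked with a short Python sketch. It is not part of the Nimbus Gazer software. It only reproduces the arithmetic of the stated settings (08:00 to 22:00 GMT, one image per minute, lab at GMT-3). The function name capture_times and the choice to treat the stop time as exclusive are assumptions made for illustration.

```python
from datetime import date, datetime, timedelta, timezone

UTC = timezone.utc
LOCAL = timezone(timedelta(hours=-3))  # lab time zone as stated in the text (GMT-3)


def capture_times(day, start_hour=8, stop_hour=22, interval_minutes=1):
    """Yield the UTC capture timestamps for one day.

    The start time is included and the stop time is excluded (an assumption).
    """
    current = datetime(day.year, day.month, day.day, start_hour, tzinfo=UTC)
    stop = datetime(day.year, day.month, day.day, stop_hour, tzinfo=UTC)
    step = timedelta(minutes=interval_minutes)
    while current < stop:
        yield current
        current += step


if __name__ == "__main__":
    times = list(capture_times(date(2021, 3, 1)))
    print("images per camera per day:", len(times))
    print("first capture, local time:", times[0].astimezone(LOCAL).strftime("%H:%M"))
    print("last capture, local time:", times[-1].astimezone(LOCAL).strftime("%H:%M"))
```

Under these assumptions, each camera produces 840 images per day (14 hours at one image per minute), with the first capture at 05:00 and the last at 18:59 local time.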


Universidade Federal de Santa Catarina


Computer Vision, Image Processing, Photovoltaics, Machine Learning, Renewable Energy, Digital Imaging