Contributors: Víctor Labayen, Eduardo Magana, Daniel Morato Oses, Mikel Izal
The dataset is a set of network traffic traces in pcap/csv format captured from a single user. The traffic is classified into five activities (Video, Bulk, Idle, Web, and Interactive), and the label is given in the filename. There is also a file (mapping.csv) with the mapping between the host's IP address, the csv/pcap filename, and the activity label.
Interactive: applications that perform real-time interactions in order to provide a suitable user experience, such as editing a file in Google Docs or remote CLI sessions over SSH.
Bulk data transfer: applications that transfer large files over the network. Some examples are SCP/FTP applications and direct downloads of large files from web servers such as Mediafire, Dropbox, or the university repository, among others.
Web browsing: contains all the traffic generated while searching and browsing different web pages. Examples of those pages are several blogs and news sites and the university's Moodle.
Video playback: contains traffic from applications that consume video via streaming or pseudo-streaming. The best-known services used are Twitch and YouTube, but the university online classroom has also been used.
Idle behaviour: consists of the background traffic generated by the user's computer when the user is idle. This traffic has been captured both with every application closed and with some pages left open, such as Google Docs, YouTube, and several web pages, but always without user interaction.
The capture is performed on a network probe attached, through a SPAN port, to the router that forwards the user's network traffic. The traffic is stored in pcap format with the full packet payload. In the csv files, every non-TCP/UDP packet is filtered out, as well as every packet with no payload. The fields in the csv files are the following (one line per packet): timestamp, protocol, payload size, source and destination IP address, and source and destination UDP/TCP port. The fields are also included as a header in every csv file.
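As an illustration of how the per-packet csv files might be processed, the following minimal sketch sums payload bytes into one-second bins. The column names (timestamp, payload_size, and so on), the assumption that timestamps are seconds expressed as decimal numbers, and the sample rows are all hypothetical; the header line of the actual csv files should be checked before reuse.

```python
import csv
import io
from collections import defaultdict


def bytes_per_second(csv_file, ts_col="timestamp", size_col="payload_size"):
    """Sum payload bytes into 1-second bins relative to the first packet.

    The column names are assumptions; pass the real header names if they differ.
    Packets are assumed to be listed in chronological order.
    """
    reader = csv.DictReader(csv_file)
    bins = defaultdict(int)
    t0 = None
    for row in reader:
        ts = float(row[ts_col])
        if t0 is None:
            t0 = ts  # first packet defines time zero
        bins[int(ts - t0)] += int(row[size_col])
    return dict(bins)


# Made-up sample with a hypothetical header, one line per packet.
sample = io.StringIO(
    "timestamp,protocol,payload_size,ip_src,ip_dst,port_src,port_dst\n"
    "10.00,TCP,100,10.0.0.2,93.184.216.34,50000,443\n"
    "10.40,TCP,1400,93.184.216.34,10.0.0.2,443,50000\n"
    "11.20,UDP,60,10.0.0.2,8.8.8.8,40000,53\n"
)
rates = bytes_per_second(sample)
print(rates)
```

For a real trace, the same function can be called on an opened file, for example `with open(path, newline="") as f: bytes_per_second(f)`, where `path` is one of the csv files listed in mapping.csv.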
The amount of data is stated as follows:
Bulk : 19 traces, 3599 s of total duration, 8704 MBytes of pcap files
Video : 23 traces, 4496 s, 1405 MBytes
Web : 23 traces, 4203 s, 148 MBytes
Interactive : 42 traces, 8934 s, 30.5 MBytes
Idle : 52 traces, 6341 s, 0.69 MBytes
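The figures above can be combined to get a rough feel for how different the five activities are. The sketch below copies the listed values and computes totals and the mean pcap volume per second for each activity. Note that pcap size includes packet headers and capture overhead, so these are only approximate rates and not payload throughput.

```python
# (number of traces, total duration in s, pcap volume in MBytes), as listed above
AMOUNTS = {
    "Bulk": (19, 3599, 8704),
    "Video": (23, 4496, 1405),
    "Web": (23, 4203, 148),
    "Interactive": (42, 8934, 30.5),
    "Idle": (52, 6341, 0.69),
}


def mean_rate_mbytes_per_s(label):
    """Mean pcap volume per second of capture for one activity."""
    _, duration_s, mbytes = AMOUNTS[label]
    return mbytes / duration_s


total_traces = sum(n for n, _, _ in AMOUNTS.values())
total_duration_s = sum(d for _, d, _ in AMOUNTS.values())

print(f"{total_traces} traces, {total_duration_s} s in total")
for label in AMOUNTS:
    print(f"{label}: {mean_rate_mbytes_per_s(label):.5f} MBytes/s")
```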
The code of our machine learning approach is also included. There is a README.txt file with the documentation of how to use the code.
Contributors: Vinh Truong Hoang
This paper introduces a new dataset for the ground-based cloud image classification task. We name it ‘Cloud-ImVN 1.0’; it is an extension of the SWIMCAT database. The dataset contains 2,100 color images (150 × 150 pixels) in 6 cloud categories. Several tasks can be applied to this dataset, including classification, clustering, and segmentation, in both supervised and unsupervised learning contexts.
This is the main data for the paper “Soil organic carbon redistribution and delivery by water erosion in a small catchment of the Yellow River basin”, including the 14C date of the source area and sink area, the 137Cs date of the sediment profile, the
Contributors: Shashank Kr Mishra
The compressed file contains the data, along with a readme file.
Contributors: Mohammad Royapoor, Sara Walker, Charalampos Patsios, Peter Davison, Mehdi Pajouhesh
Data in summary:
1- Building total B side: This is metered data from one of the two mains busbars, which supplies all non-emergency services and HVAC equipment.
2- Building total A side: This is metered data from the second of the two mains busbars, which supplies all emergency services, including fire safety, comm rooms, emergency lighting, and public announcement. It is also connected to a PV array with a peak electrical supply of around 33 kWe.
3- Half hourly building demand and deferrable load breakdowns: This is processed data that includes the building total and half-hourly (HH) instances of deferrable loads for all sub-categories of loads considered in this work. It also includes HH instances of PV generation and outside air temperature.
4- Early morning ramp rates following plant start-up: This file contains the difference between two instantaneous recordings of total building electricity consumption, which shows the continuous fluctuation in total electricity demand in the building.
5- CO2-raw (Typical office): This file contains actual CO2 data from an office that represents the typical space occupant density in the case study building.
6- CO2-raw (worst case): This file contains actual CO2 data from a teaching space that represents the highest observed space occupant density in the case study building.
7- Warming and cooling rates in the worst case zones: This file includes actual data describing the operational temperature in the worst case zones, those most prone to overheating in summer and excessive heat loss in winter.
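The ramp rates in item 4 are described as differences between instantaneous recordings of total building consumption. A minimal sketch of that calculation, assuming the recordings are consecutive and evenly spaced, is shown below. The sampling interval, the units, and the sample values are made up for illustration and are not taken from the dataset.

```python
def ramp_rates(readings, interval=1.0):
    """Difference between consecutive readings, divided by the sampling interval.

    With interval=1.0 the result is simply the change per sample.
    """
    return [(later - earlier) / interval
            for earlier, later in zip(readings, readings[1:])]


# Hypothetical total building demand during plant start-up (illustrative values).
demand_kw = [120.0, 135.5, 150.0, 148.0]
rates = ramp_rates(demand_kw)
print(rates)
```

Positive values correspond to demand ramping up and negative values to demand falling between two recordings.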
Contributors: Tae Keun Yoo
This study included 451 anonymized UWF and 745 FP images. The ultra-widefield (UWF) images, which include both normal and pathologic retinal images, were taken from the Tsukazaki Optos Public Project. The traditional fundus photograph (FP) images were extracted from publicly accessible databases using Google Images and Google Dataset Search with English keywords related to the retina. The search strategy was based on the following key terms: “fundus photography”, “retinal image”, and “fundus dataset”. The images were manually reviewed by two board-certified ophthalmologists, and blurred and low-quality images were removed to clarify the image domains. Duplicated images were also removed. Consequently, 451 images with artifacts and 745 images without artifacts were collected.
The UWF images were cropped and masked after registration for CycleGAN.
Contributors: Sattar Dorafshan, Hoda Azari
The dataset includes 2,016 impact echo signals from eight identical laboratory-made concrete specimens. The dataset is annotated in two classes: sound concrete (Class S) and defective concrete (Class D).
Contributors: Julianne Oliveira, Diego Carvalho, Everton Santos, Rubens Lamparelli, Gleyce Figueiredo, Edemar Moro, Ana Flávia Bonamigo, Johnny Soares, Leonardo Monteiro, Murilo Vianna, Eleanor Campbell, Deepak Jaiswal, Lee Lynd, John Sheehan
The objective of this dataset is to present forage biomass production over time in different pasture management systems. We selected two farms located in the Western region of São Paulo State, Brazil. Pasture field data were collected on both farms on three dates (June and November 2018 and March 2019) covering two seasons (wet and dry). Samples were taken regularly over time to monitor forage biomass. These fields represent a wide variety of pasture management, as follows:
Farm 1 (Santa Clara): i) traditional, low forage productivity, cattle rotation; ii) traditional, intermediate forage productivity, fertilized, cattle rotation; iii) intensified pasture, high forage productivity, reformed, cattle rotation.
Farm 2 (Poderosa): i) traditional degraded*, recently reformed with millet + grass, cattle rotation; ii) traditional, low forage productivity, signs of degradation, fertilized, cattle rotation. *"Degraded" was determined by visual analysis of pasture areas with sparse grass and exposed soil in some places.
With the support of NDVI images from the MODIS sensor, sample pixels were used to allocate the sample points. The areas of these pixels were divided into nine sampling points and in each of these points, the forage biomass was collected. Soil analyses were also carried out in two seasons (June 2018 and March 2019).
The data files are organized in three folders. Each folder represents one field campaign. Each folder has a shapefile of all the fields, the same file in KML format (to open in Google Earth), and a zip file with photographs of each field taken during the field campaign. The attribute table of the shapefile has a description of the fields and biomass. Excel files show the same information as the attribute table, together with a description of the items. A figure with the template of the biomass collection scheme is also available. Soil analyses are in the folders 'June 2018' and 'March 2019'. A more detailed description and discussion of these data and their association with soil chemical analysis is given in a scientific report (available on request).
The biomass collection enabled the analysis of forage production and a better diagnosis of resource utilization strategies across the different pasture systems.
This work was funded by the São Paulo Research Foundation (process numbers 2018/10770-1, 2017/06037-4, 2016/08741-8, 2017/08970-0, 2018/11052-5 and 2014/26767-9) as part of the Global Sustainable Bioenergy Initiative.