Time series dataset for network security situational awareness
Description
In the field of network security situational awareness (NSSA), it is challenging to find a usable dataset. Most of the datasets used in existing research papers are outdated, small, publicly unavailable because they were created on private infrastructure, or unusable for other reasons. This paper presents a new dataset derived from a well-documented and substantial source. It is suitable for use with neural networks, which require larger datasets than classical machine learning approaches, and is intended to support further NSSA research. The dataset consists of four parts, each containing time series generated from cybersecurity alerts collected in two periods: 2017 to 2018 and 2023 to 2024. The alerts come from the Warden system, which collects and shares information about security events detected by various security systems across multiple organizations. In total, about three billion alerts were collected and processed into time series.
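As an illustration of how raw alerts can be turned into a time series, the following minimal Python sketch counts alerts per fixed-width interval. It is not the processing pipeline used to build this dataset: the function name `alerts_to_time_series`, the one-hour interval, and the sample timestamps are all assumptions made for the example.

```python
import math
from datetime import datetime, timedelta


def alerts_to_time_series(alert_times, start, end, step):
    """Count alerts in consecutive intervals of width `step` covering [start, end).

    Hypothetical helper for illustration only; the dataset's actual
    aggregation procedure may differ.
    """
    n_bins = math.ceil((end - start) / step)
    counts = [0] * n_bins
    for t in alert_times:
        # Alerts outside the observation window are ignored.
        if start <= t < end:
            counts[(t - start) // step] += 1
    return counts


# Made-up alert detection times (not taken from the dataset).
alerts = [
    datetime(2023, 1, 1, 0, 5),
    datetime(2023, 1, 1, 0, 40),
    datetime(2023, 1, 1, 2, 15),
]

series = alerts_to_time_series(
    alerts,
    start=datetime(2023, 1, 1, 0, 0),
    end=datetime(2023, 1, 1, 3, 0),
    step=timedelta(hours=1),
)
print(series)  # [2, 0, 1]
```

Each element of the resulting list is the number of alerts observed in one interval, so intervals without alerts appear as explicit zeros rather than gaps.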