SNT (Simulated Network Traffic Using Mininet and Ryu) DDoS Detection Dataset
Description
Research Hypothesis
The primary hypothesis of this research is that machine learning (ML) models can effectively detect Distributed Denial of Service (DDoS) attacks in Software-Defined Networking (SDN) environments by leveraging flow-level network traffic features. The dataset was designed to evaluate ML algorithms' ability to distinguish between normal and malicious traffic under realistic network conditions.

What the Data Shows
This dataset captures detailed flow statistics for normal and malicious network traffic:
Normal Traffic: Simulated regular client-server communications in a realistic data center environment.
Malicious Traffic: Traffic generated by launching DDoS attacks such as ICMP flood, UDP flood, TCP SYN flood, and LAND attack.
The dataset provides labeled traffic data, allowing users to train and test ML models for binary classification (normal vs. DDoS) or multi-class classification (by attack type).

Notable Findings
Balanced Data Distribution: The dataset includes 1,034,669 records:
  527,576 benign flows (normal traffic).
  507,093 malicious flows (DDoS attacks).
Comprehensive Feature Set: 22 flow-level features, such as protocol type, packet counts, byte counts, and traffic duration, enabling detailed analysis of traffic patterns.
High-Quality Labels: Each flow is accurately labeled for supervised ML tasks, with attack traffic optionally classified by type.

How the Data Was Gathered
Simulation Environment:
  Mininet: Used to simulate the network topology with multiple hosts and switches.
  Ryu SDN Controller: Managed traffic and collected flow statistics via its flow monitoring feature.
Normal Traffic: Generated using a Python script that simulated client-server interactions with randomized patterns, mimicking real-world user behavior. Traffic size distribution was modeled on typical data center workloads.
Malicious Traffic: Created by launching DDoS attacks (e.g., ICMP flood, UDP flood) using tools like hping3.
Attack patterns were randomized to simulate real-world scenarios.
The traffic was monitored, captured, and exported as a labeled CSV file for analysis.

Interpreting the Data
Labels:
  1 for normal traffic.
  0 for DDoS traffic (or further classified by attack type).
Features: Each row represents a single flow with detailed attributes such as packet counts, duration, and protocol type.
Use Cases:
  Train ML models for DDoS detection.
  Analyze traffic patterns for feature importance.
  Explore SDN-based security and anomaly detection techniques.

Usage Recommendations
This dataset is ideal for researchers and practitioners working on:
  DDoS detection and mitigation strategies.
  Network traffic analysis in SDN environments.
  Feature engineering for network security tasks.
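
Because the label convention (1 for normal, 0 for DDoS) is the reverse of what many ML pipelines assume, a quick sanity check of the class balance before training may be useful. The following is a minimal sketch using only the Python standard library. The column names (`protocol`, `packet_count`, `byte_count`, `duration_sec`, `label`) and the sample rows are assumptions made for illustration; the real CSV header may differ and should be checked first.

```python
import csv
import io
from collections import Counter

# Label convention taken from the description: 1 = normal, 0 = DDoS.
LABEL_NAMES = {"1": "normal", "0": "ddos"}


def class_balance(csv_file, label_column="label"):
    """Count flows per class and return (counts, fractions).

    `label_column` is a hypothetical column name; adjust it to the
    header actually used in the exported CSV.
    """
    reader = csv.DictReader(csv_file)
    counts = Counter(
        LABEL_NAMES.get(row[label_column].strip(), "unknown") for row in reader
    )
    total = sum(counts.values())
    fractions = {name: n / total for name, n in counts.items()} if total else {}
    return counts, fractions


# Made-up sample rows standing in for the real file (not actual dataset records).
SAMPLE = """protocol,packet_count,byte_count,duration_sec,label
tcp,12,1840,3,1
udp,9500,560000,2,0
icmp,8700,522000,1,0
tcp,30,4100,7,1
tcp,18,2600,5,1
"""

counts, fractions = class_balance(io.StringIO(SAMPLE))
for name in sorted(counts):
    print(f"{name}: {counts[name]} flows ({fractions[name]:.1%})")

# The published totals can be checked the same way.
published = {"normal": 527_576, "ddos": 507_093}
published_total = sum(published.values())
```

To run the check on the real data, pass an open file handle for the exported CSV (opened with `newline=""`, as the `csv` module recommends) in place of the `io.StringIO` sample. A count under `unknown` would suggest the file uses attack-type labels rather than the binary 0/1 encoding.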