AdDDoSDN: Adversarial DDoS Attacks Dataset for Software-Defined Networks

Published: 19 September 2025 | Version 1 | DOI: 10.17632/9jp6r68y98.1
Contributor:

Description

The AdDDoSDN dataset is a network traffic corpus built for defensive SDN research. It captures coordinated DDoS attacks and benign enterprise activity in controlled Mininet experiments driven by a remote Ryu L3 controller, and provides labeled data for developing real-time detection.

The environment emulates a segmented four-subnet enterprise. h1 (192.168.10.10/24) acts as the external attacker. h2–h5 (192.168.20.10–13/24) form the corporate client subnet, with h2 handling ICMP exchanges and h3/h5 generating TCP and UDP application sessions. h6 (192.168.30.10/24) resides in the server/DMZ subnet as the primary victim. Controller services operate on 192.168.0.0/24. This layout provides realistic inter-subnet attack paths while preserving centralized SDN visibility.

The dataset follows a structured, configurable timeline sourced from config.json. The default cycle spans roughly 35 minutes per run and consists of:
- a 5-second initialization period;
- 1,600 seconds of benign traffic mixing ICMP, Telnet, SSH, FTP, HTTP/S, and DNS exchanges;
- enhanced traditional attacks from h1: an 88-second SYN flood and a 176-second UDP flood against h6, plus an 88-second ICMP flood toward h4;
- adversarial attacks from h1 to h6: a 72-second TCP state-exhaustion phase with human-like timing patterns, a 24-second application-layer mimicry burst combining heavy HTTP range/POST requests with legitimate queries, and a 72-second slow-read phase sustaining long-lived connections.

Traditional phases operate at around 20–30 packets per second with protocol-compliant options, while adversarial scripts emphasize mimicry and timing jitter.

The dataset provides three synchronized data products derived from each capture cycle:
1. Packet-level data (adddosdn_packet_dataset.csv): 30 header fields + 2 labels extracted directly from PCAP phases.
2. SDN flow-level data (adddosdn_flow_dataset.csv): Controller statistics with derived rates and labels collected via the Ryu REST API.
3. CICFlow aggregated data (adddosdn_cicflow_dataset.csv): 85 bidirectional behavioral features generated with CICFlowMeter.

The dataset contains 3.5 million records in total across its dataset instances, each of which represents a different temporal scenario. Labels span normal, syn_flood, udp_flood, icmp_flood, ad_syn, ad_udp, and ad_slow. Label_binary collapses them into benign (0) versus malicious (1) classes to maintain consistency across the packet, controller-flow, and behavioral representations.
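
The default timeline and the label scheme described above can be sanity-checked with a few lines of Python. This is only a sketch under stated assumptions: the phase names are illustrative and are not the keys used in config.json, the phases are assumed to run back to back, and `normal` is assumed to be the only label that maps to the benign class.

```python
# Phase durations in seconds, as quoted in the description.
# Phase names are illustrative; they are not the keys used in config.json.
PHASE_SECONDS = {
    "initialization": 5,
    "benign_traffic": 1600,
    "syn_flood": 88,
    "udp_flood": 176,
    "icmp_flood": 88,
    "tcp_state_exhaustion": 72,
    "app_layer_mimicry": 24,
    "slow_read": 72,
}

# Multi-class labels listed in the description.
LABELS = ["normal", "syn_flood", "udp_flood", "icmp_flood",
          "ad_syn", "ad_udp", "ad_slow"]


def cycle_seconds(phases=PHASE_SECONDS):
    """Total cycle length, assuming the phases run back to back."""
    return sum(phases.values())


def to_binary(label):
    """Assumed mapping: 'normal' is benign (0), any other label is malicious (1)."""
    if label not in LABELS:
        raise ValueError(f"unknown label: {label!r}")
    return 0 if label == "normal" else 1


if __name__ == "__main__":
    total = cycle_seconds()
    print(f"{total} s = {total / 60:.1f} min")
    print({label: to_binary(label) for label in LABELS})
```

Under these assumptions the phases add up to 2,125 seconds, about 35.4 minutes, which agrees with the stated default cycle of roughly 35 minutes.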

Steps to reproduce

1. Bulk Dataset Generation
   sudo python3 dataset_generation/run_bulk_mainv4.py --runs 10
2. Data Transfer to Processing Server
   scp -r user@server:/path/to/dataset_generation/main_output/v4/ ./local_backup/
3. Transfer to CICFlow Processing Server
   scp -r ./local_backup/v4/ user@cicflowserver:/path/to/processing/
4. CICFlow Feature Extraction
   python3 test/run_cicflow_v2_main.py main_output/v4 cicflow_output/v4
5. Transfer CICFlow Results Back
   scp -r user@cicflowserver:/path/to/cicflow_output/v4/ ./dataset_generation/cicflow_output/
6. Dataset Combination and Synchronization
   cd dataset_generation/dataset_cleanup/
   python3 combine_datasets.py --path ../main_output/v4
7. Quality Investigation and Analysis
   python3 investigate_csv_quality.py --path ../main_output/v4
8. Timeline Validation
   python3 analyze_timeline_v3.py --path ../main_output/v4
9. Data Leakage Assessment
   python3 assess_data_leakage.py --path ../main_output/v4
10. Data Leakage Validation
   python3 validate_data_leakage.py --path ../main_output/v4
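
Steps 6 to 10 all run locally against the same output path, so they could be chained in a small wrapper. The sketch below is hypothetical and not part of the published tooling: the script names and the `--path` flag are copied from the steps above, while `build_commands`, `run_all`, and the assumption that all five scripts live in dataset_generation/dataset_cleanup/ are illustrative.

```python
import shlex
import subprocess

# Output path used by steps 6 to 10.
DATA_PATH = "../main_output/v4"

# (step title, script name) pairs copied from the steps above.
LOCAL_STEPS = [
    ("Dataset combination and synchronization", "combine_datasets.py"),
    ("Quality investigation and analysis", "investigate_csv_quality.py"),
    ("Timeline validation", "analyze_timeline_v3.py"),
    ("Data leakage assessment", "assess_data_leakage.py"),
    ("Data leakage validation", "validate_data_leakage.py"),
]


def build_commands(data_path=DATA_PATH):
    """Return one argument list per local step, in order."""
    return [["python3", script, "--path", data_path]
            for _, script in LOCAL_STEPS]


def run_all(data_path=DATA_PATH, cwd="dataset_generation/dataset_cleanup"):
    """Run the steps in order and stop at the first failure.

    Assumption: every script sits in `cwd`, as the `cd` in step 6 suggests.
    """
    for (title, _), cmd in zip(LOCAL_STEPS, build_commands(data_path)):
        print(f"==> {title}: {shlex.join(cmd)}")
        subprocess.run(cmd, cwd=cwd, check=True)
```

Calling `run_all()` from the repository root would execute the five scripts in sequence; `check=True` makes a non-zero exit code raise `subprocess.CalledProcessError`, so a failed combination step does not feed into the later validation steps.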

Institutions

Universiti Kebangsaan Malaysia

Categories

Cybersecurity, Network Security, Machine Learning

Licence