Dataset for a Systematic Mapping Study of Deep Learning-Based DDoS Detection in SDN and 5G/B5G Networks (2018–2024)
Description
This dataset contains the primary studies used in a systematic mapping study (SMS) on deep learning-based Distributed Denial of Service (DDoS) detection in Software-Defined Networking (SDN), 5G, and beyond-5G (B5G) network environments. The dataset includes 208 peer-reviewed publications published between 2018 and 2024, retrieved from Scopus and IEEE Xplore using a structured search query.

Each entry corresponds to a single study and includes bibliographic information (title, authors, year, publication venue, etc.), as well as manually curated and derived attributes. In addition to bibliographic metadata, the dataset provides structured annotations extracted from the title, abstract, and manual tags, including:
- learning_type (e.g., deep learning, reinforcement learning)
- dl_architecture (e.g., CNN, LSTM, hybrid models)
- network_context (e.g., SDN, 5G, IoT, edge)
- dataset_used (e.g., CICDDoS2019, CICIDS2017, custom datasets)
- evaluation_setting (e.g., offline, simulation/testbed, real network)

The annotations were generated through a semi-automated process combining script-based extraction and manual validation based on titles, abstracts, and manually assigned tags, without full-text analysis. In cases where specific information was not explicitly stated, the corresponding fields were marked as "Not specified".

This dataset is intended to support reproducibility and transparency of the associated systematic mapping study, as well as to facilitate further research on deep learning-based DDoS detection in programmable networks.
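As an illustration of how the annotation columns might be used, the following minimal Python sketch tallies the values of one column while counting "Not specified" entries separately. The column names come from the description above; the sample rows, the `tally` helper, and the assumption of one value per cell are hypothetical and are not taken from the published files.

```python
import csv
import io
from collections import Counter

NOT_SPECIFIED = "Not specified"  # marker used in the dataset for missing information

# Hypothetical sample rows for illustration only; the real file name and
# values are not given in this description.
SAMPLE_CSV = """title,year,learning_type,dl_architecture,network_context,dataset_used,evaluation_setting
Example study A,2021,deep learning,CNN,SDN,CICDDoS2019,offline
Example study B,2023,deep learning,LSTM,5G,Not specified,simulation/testbed
Example study C,2022,reinforcement learning,Not specified,SDN,CICIDS2017,offline
"""


def tally(rows, column):
    """Count values in one annotation column.

    Returns (counts, missing), where `missing` is the number of rows whose
    value is empty or equal to the "Not specified" marker.
    """
    counts = Counter()
    missing = 0
    for row in rows:
        value = (row.get(column) or "").strip()
        if not value or value == NOT_SPECIFIED:
            missing += 1
        else:
            counts[value] += 1
    return counts, missing


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
context_counts, context_missing = tally(rows, "network_context")
dataset_counts, dataset_missing = tally(rows, "dataset_used")
print(context_counts.most_common(), context_missing)
print(dataset_counts.most_common(), dataset_missing)
```

To apply the same idea to the published CSV, the in-memory sample would be replaced by a file opened with `open(path, newline="", encoding="utf-8")`, after checking the actual column headers in the file.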
Files
Steps to reproduce
1. Execute the search query in Scopus and IEEE Xplore to retrieve publications related to deep learning-based DDoS detection in SDN and 5G/B5G networks for the period 2018–2024.
2. Import the retrieved records into Zotero, then merge and deduplicate them.
3. Apply inclusion and exclusion criteria to filter peer-reviewed articles and conference papers relevant to the study scope.
4. Export bibliographic metadata (title, authors, year, venue, abstract) from both databases.
5. Apply a semi-automated annotation process using a Python script to extract initial features from titles, abstracts, and manual tags.
6. Perform manual validation and refinement of the extracted attributes, including learning_type, dl_architecture, network_context, dataset_used, and evaluation_setting.
7. Mark fields as "Not specified" where relevant information is not explicitly available in the title or abstract.
8. Compile the final dataset in structured tabular format (CSV/Excel).

The full methodology and classification criteria are described in the accompanying manuscript, which is currently under submission. Because database indexing changes over time and the annotation involves manual interpretation, an independent replication may not reproduce the dataset exactly; however, the methodology allows the study selection and classification process to be approximated consistently.
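The deduplication, script-based extraction, and "Not specified" steps above could be sketched roughly as follows. This is not the authors' script: the keyword rules, the `normalize_title`, `deduplicate`, and `annotate` helpers, the record fields, and the sample records are all assumptions made for illustration, and the title-based deduplication only stands in for the merge that the workflow performs in Zotero. The actual classification criteria are those described in the manuscript.

```python
import re

NOT_SPECIFIED = "Not specified"

# Hypothetical keyword rules (regular expressions matched on lowercased text).
# The real rule set used for the dataset is not given in this description.
RULES = {
    "dl_architecture": {
        "CNN": [r"\bcnn\b", r"convolutional"],
        "LSTM": [r"\blstm\b", r"long short-term memory"],
    },
    "network_context": {
        "SDN": [r"\bsdn\b", r"software[- ]defined network"],
        "5G": [r"\b5g\b"],
    },
    "dataset_used": {
        "CICDDoS2019": [r"cic-?ddos-?2019"],
        "CICIDS2017": [r"cic-?ids-?2017"],
    },
}


def normalize_title(title):
    """Lowercase a title and collapse punctuation so near-identical titles match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record for each normalized title (a stand-in for the Zotero merge)."""
    seen = set()
    unique = []
    for rec in records:
        key = normalize_title(rec["title"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def annotate(record):
    """Propose initial labels from title, abstract, and tags; these still need manual validation."""
    text = " ".join(
        [record.get("title", ""), record.get("abstract", ""), record.get("tags", "")]
    ).lower()
    labels_out = {}
    for field, labels in RULES.items():
        hits = [
            label
            for label, patterns in labels.items()
            if any(re.search(p, text) for p in patterns)
        ]
        labels_out[field] = "; ".join(hits) if hits else NOT_SPECIFIED
    return labels_out


# Made-up records for illustration only.
records = [
    {"title": "A CNN-LSTM Detector for DDoS in SDN", "abstract": "Evaluated on CICDDoS2019.", "tags": ""},
    {"title": "a cnn-lstm detector for ddos in sdn.", "abstract": "Same paper from the second database.", "tags": ""},
    {"title": "DDoS mitigation in 5G slices", "abstract": "We propose a learning-based approach.", "tags": ""},
]

unique_records = deduplicate(records)
annotations = [annotate(rec) for rec in unique_records]
for rec, ann in zip(unique_records, annotations):
    print(rec["title"], ann)
```

Keyword matching of this kind only yields candidate labels; it cannot distinguish, for example, a dataset that is merely mentioned from one that is actually used, which is why the manual validation step remains necessary.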