NoSQLiM811: MongoDB Injection Detection Dataset

Published: 15 May 2026 | Version 1 | DOI: 10.17632/d3kc8mm247.1
Contributors:

Description

NoSQLiM811 is a balanced and systematically curated dataset of 1,600 MongoDB queries, consisting of 800 benign and 800 malicious examples created to support research on NoSQL injection detection and related security analysis tasks. The queries were generated and validated in a controlled MongoDB 8.0.11 environment using a consistent document structure and cover all CRUD operations across multiple collections. Malicious samples span a broad range of NoSQL injection techniques, including tautologies, union‑based injections, JavaScript injections, piggybacked queries, blind injections, and time‑based variants, while benign queries reflect realistic operational behavior. All queries are provided as raw MongoDB strings with a binary label indicating benign (0) or malicious (1), stored in a single CSV file for easy integration into machine learning workflows, anomaly detection, and the development of NoSQL injection detection techniques. The dataset is intended as a reusable resource for developing, testing, and comparing NoSQL security models and can be extended to incorporate additional attack patterns.
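As a minimal sketch of how the single CSV file described above might be read and checked for class balance, the Python snippet below parses query/label pairs and counts each class. The column names `query` and `label` are assumptions, since the description does not state the CSV header, and the two sample rows are illustrative stand-ins rather than entries from the dataset.

```python
import csv
import io
from collections import Counter

# Hypothetical column names; the actual CSV header may differ.
QUERY_COL = "query"
LABEL_COL = "label"


def load_queries(handle):
    """Read (query, label) pairs from an open CSV file handle."""
    reader = csv.DictReader(handle)
    return [(row[QUERY_COL], int(row[LABEL_COL])) for row in reader]


def class_counts(rows):
    """Count how many rows carry each label (0 = benign, 1 = malicious)."""
    return Counter(label for _, label in rows)


# Tiny in-memory stand-in for the real file (illustrative rows only).
sample = io.StringIO(
    'query,label\n'
    '"db.users.find({""name"": ""alice""})",0\n'
    '"db.users.find({""name"": {""$ne"": null}})",1\n'
)

rows = load_queries(sample)
counts = class_counts(rows)
print(counts)
```

To run this against the downloaded file, the `io.StringIO` object could be replaced with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved. If the header matches the assumed names, the counts should come out as 800 per class, in line with the description.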

Files

Institutions

Categories

Database

Licence