DNS Exfiltration Dataset

Published: 8 November 2022| Version 2 | DOI: 10.17632/c4n7fckkz3.2
Kristijan Ziza,
Pavle Vuletić,
Predrag Tadić


DNS exfiltration dataset was recorded in a realistic network environment. More than 50 million DNS requests were recorded on one of the ISP's DNS servers. The data in the dataset was anonymised by changing all IP addresses using injective mapping. Features in the dataset are split into single request and aggregate features. Single request or DNS label-based features can be calculated for each DNS request independently using only the textual characteristics of the request. On the other hand, aggregate features are calculated using multiple subsequent request from one client to a particular TLD. This reduces the size of the dataset to about 35 million records. The complete list of features with descriptions can be found in dataset_description.txt file. For all of the features which are based on finding English words in the request we used about 60.000 most commom English words. The list of used words can be found in english_words.txt. The main dataset (dataset.csv) contains regular requests and exfiltrations performed using DNSExfiltrator and Iodine tools. Additional dataset (dataset_modified.csv) contains only exfiltrations executed with modified DNSExfiltrator tool. Waiting times between two consecutive requests in this dataset are randomised and the requests also have lower entropy causing the detection to be much harder.



Univerzitet u Beogradu


Computer Network, Cybersecurity, Machine Learning