Hornet 40: Network Dataset of Geographically Placed Honeypots

Published: 27 July 2021 | Version 3 | DOI: 10.17632/tcfzkbpw46.3
Veronica Valeros


Hornet 40 is a dataset of 40 days of network attack traffic captured on cloud servers used as honeypots. It is intended to help understand how geography may affect the inflow of network attacks. The honeypots are located in eight cities: Amsterdam, London, Frankfurt, San Francisco, New York, Singapore, Toronto, and Bangalore. The data was captured in April, May, and June 2021.

The eight cloud servers were created and configured simultaneously following identical instructions. Network traffic was captured on each cloud server with the Argus network monitoring tool. Each cloud server ran only one service (SSH on a non-standard port) and was fully dedicated to acting as a honeypot. No honeypot software was used in this dataset.

The dataset consists of eight scenarios, one for each geographically located cloud server. Each scenario contains bidirectional NetFlow files, which are distributed in the following formats:
- hornet40-biargus.tar.gz: all scenarios with bidirectional NetFlow files in Argus binary format;
- hornet40-netflow-v5.tar.gz: all scenarios with bidirectional NetFlow v5 files in CSV format;
- hornet40-netflow-extended.tar.gz: all scenarios with bidirectional NetFlow files in CSV format containing all features provided by Argus;
- hornet40-full.tar.gz: all the data in a single download (biargus, NetFlow v5, and extended NetFlow).
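As a rough illustration of working with the CSV archives listed above, the following Python sketch counts the rows of each CSV file inside a tar.gz archive without unpacking it to disk. The internal layout of the archives is not described here, so the ".csv" file extension and the function name count_rows_per_member are assumptions made for this example only. Rows are counted as-is, so a header row, if any file has one, would be included in the count.

```python
import csv
import io
import tarfile


def count_rows_per_member(archive_path: str) -> dict[str, int]:
    """Return {member name: number of CSV rows} for a .tar.gz archive.

    Assumption: CSV members end in ".csv"; adjust the filter if the
    real archive uses a different naming scheme.
    """
    counts: dict[str, int] = {}
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive:
            if not member.isfile() or not member.name.endswith(".csv"):
                continue
            raw = archive.extractfile(member)  # binary file-like object
            text = io.TextIOWrapper(
                raw, encoding="utf-8", errors="replace", newline=""
            )
            counts[member.name] = sum(1 for _ in csv.reader(text))
    return counts


# Example call (requires the downloaded archive to be present locally):
# counts = count_rows_per_member("hornet40-netflow-v5.tar.gz")
# for name, rows in sorted(counts.items()):
#     print(name, rows)
```

Grouping the resulting counts by scenario would give a first impression of how traffic volume differs between locations, once the actual directory structure of the archive is known.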


Steps to reproduce

This dataset used cloud server instances from Digital Ocean. All cloud servers have the same technical configuration:
a) Operating System: Ubuntu 20.04 LTS
b) Instance Capacity: 1 GB / 1 Intel CPU
c) Instance Storage: 25 GB NVMe SSDs
d) Instance Transfer: 1000 GB transfer

Once the cloud instances were created, the servers were configured simultaneously using the parallel-ssh and parallel-scp tools:
i. Update the software repository: apt update
ii. Install Argus: apt install -yq argus-client argus-server
iii. Upload the common SSH configuration, with SSH on a non-standard port, to each server at /etc/ssh/sshd_config
iv. Restart the SSH servers: /etc/init.d/ssh restart
v. Upload the common Argus configuration to each server at /etc/argus.conf
vi. Start the Argus server: argus -F /etc/argus.conf -i eth0
vii. Create a folder to store the NetFlow files: mkdir /root/dataset
viii. Start rasplit to store the network data received by Argus: rasplit -S -M time 1h -w /root/dataset/%Y/%m/%d/do-sensor.%H.%M.%S.biargus

SSH Configuration:
AcceptEnv LANG LC_*
ChallengeResponseAuthentication no
Include /etc/ssh/sshd_config.d/*.conf
PasswordAuthentication no
PermitRootLogin yes
Port 902
PrintMotd no
Subsystem sftp /usr/lib/openssh/sftp-server
UsePAM yes
X11Forwarding yes

Argus Configuration:
ARGUS_FLOW_TYPE="Bidirectional"
ARGUS_FLOW_KEY="CLASSIC_5_TUPLE"
ARGUS_ACCESS_PORT=900
ARGUS_INTERFACE=eth0
ARGUS_FLOW_STATUS_INTERVAL=3600
ARGUS_MAR_STATUS_INTERVAL=60
ARGUS_GENERATE_RESPONSE_TIME_DATA=yes
ARGUS_GENERATE_PACKET_SIZE=yes
ARGUS_GENERATE_JITTER_DATA=yes
ARGUS_GENERATE_MAC_DATA=yes
ARGUS_GENERATE_APPBYTE_METRIC=yes
ARGUS_GENERATE_TCP_PERF_METRIC=yes
ARGUS_GENERATE_BIDIRECTIONAL_TIMESTAMPS=yes
ARGUS_CAPTURE_DATA_LEN=480
ARGUS_BIND_IP="::1,"

Ra configuration:
RA_PRINT_LABELS=0
RA_FIELD_DELIMITER=','
RA_USEC_PRECISION=6
RA_PRINT_NAMES=0
RA_TIME_FORMAT="%Y/%m/%d %T.%f"
RA_FIELD_SPECIFIER= srcid seq stime ltime dur sstime sltime sdur dstime dltime ddur srng drng trans flgs avgdur stddev mindur maxdur saddr dir daddr proto sport dport sco dco stos dtos sdsb ddsb sttl dttl shops dhops sipid dipid pkts spkts dpkts bytes sbytes dbytes appbytes sappbytes dappbytes load sload dload rate srate drate loss sloss dloss ploss sploss dploss senc denc smac dmac smpls dmpls svlan dvlan svid dvid svpri dvpri sintpkt dintpkt sintpktact dintpktact sintpktidl dintpktidl sintpktmax sintpktmin dintpktmax dintpktmin sintpktactmax sintpktactmin dintpktactmax dintpktactmin sintpktidlmax sintpktidlmin dintpktidlmax dintpktidlmin jit sjit djit jitact sjitact djitact jitidl sjitidl djitidl state deldur delstime delltime dspkts ddpkts dsbytes ddbytes pdspkts pddpkts pdsbytes pddbytes suser:1500 duser:1500 tcpext swin dwin jdelay ldelay bins binnum stcpb dtcpb tcprtt synack ackdat inode smaxsz sminsz dmaxsz dminsz
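
The Ra configuration above sets RA_PRINT_LABELS=0, which suggests that CSV files produced with it carry no header row and that column names have to come from RA_FIELD_SPECIFIER. The Python sketch below shows one way to do that, assuming the CSV columns follow the specifier order and that timestamps follow RA_TIME_FORMAT with six fractional digits. The helper names (column_names, read_flows, parse_time) are hypothetical. The specifier is shortened and the sample row is synthetic, using documentation-range IP addresses; for real files, pass the full RA_FIELD_SPECIFIER value.

```python
import csv
import io
from datetime import datetime


def column_names(specifier: str) -> list[str]:
    """Split a field specifier and drop width suffixes such as ':1500'."""
    return [field.split(":")[0] for field in specifier.split()]


def read_flows(handle, names):
    """Yield one dict per row from a headerless, comma-delimited file."""
    for row in csv.reader(handle):
        yield dict(zip(names, row))


def parse_time(value: str) -> datetime:
    """Parse a timestamp written as '%Y/%m/%d %T.%f' with microseconds."""
    return datetime.strptime(value, "%Y/%m/%d %H:%M:%S.%f")


# Shortened specifier for illustration only; not the full field list.
names = column_names("stime dur saddr sport dir daddr dport proto suser:1500")

# Synthetic row, not taken from the dataset.
sample = io.StringIO(
    "2021/04/23 10:00:00.123456,1.5,192.0.2.1,40000,->,198.51.100.7,902,tcp,\n"
)

for flow in read_flows(sample, names):
    started = parse_time(flow["stime"])
    print(started.isoformat(), flow["saddr"], flow["dport"], flow["proto"])
```

Since zip stops at the shorter of its two inputs, a mismatch between the number of names and the number of columns goes unnoticed here; comparing len(row) with len(names) on the first row is a cheap check before processing a whole file.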