DroNER: Dataset for Drone Named Entity Recognition

Name: DroNER: Dataset for Drone Named Entity Recognition
Creator: Swardiantara Silalahi
Published: 2022-12-16T08:11:27.734Z
Keywords: Digital Forensic Technique, Drone (Aircraft)

Silalahi, Swardiantara; Ahmad, Tohari; Studiawan, Hudan

doi:10.17632/fwcjyc754h.1

Published: 16 December 2022 | Version 1 | DOI: 10.17632/fwcjyc754h.1

Contributors: Swardiantara Silalahi, Tohari Ahmad, Hudan Studiawan

Description

The dataset is constructed from several drone images acquired from VTO Labs Drone Forensic Dataset [1]. Its main objective is to support named entity recognition (NER) on the human-readable messages contained in drone flight log files. Six entity types (component, action, issue, parameter, state, and function) are identified as the regions of interest in the problem domain and are used to label the entities mentioned in a log message. The entity types are identified in the context of drone forensics, because the dataset was originally built to train an information extraction model that helps a forensic investigator pinpoint incident-related log records. The dataset is annotated in two ways, with consistent tagging and with contextual tagging, so that the effect of contextual tagging on an NER model's performance can be compared. Contextual tagging considers the surrounding words and uses the longest span as the context to determine which entity type a particular word belongs to. In contrast, consistent tagging uses the shortest span as the context of a word within a sentence. The train and test sets are split by drone model. The resulting proportion is 76:24, because the number of messages extracted from each drone image cannot be controlled.
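The split by drone model can be sketched as follows. This is only an illustration of a group-based split, not the authors' script: the record layout, the `drone_model` field, the model names, and the message counts are all hypothetical placeholders, with the counts chosen so that the example lands on the 76:24 proportion mentioned above.

```python
def split_by_drone_model(records, test_models):
    """Send each message to train or test according to its source drone model."""
    train, test = [], []
    for record in records:
        target = test if record["drone_model"] in test_models else train
        target.append(record)
    return train, test


# Hypothetical records: model names, messages, and counts are placeholders,
# not content taken from the dataset.
records = (
    [{"drone_model": "model_a", "message": f"message {i}"} for i in range(50)]
    + [{"drone_model": "model_b", "message": f"message {i}"} for i in range(26)]
    + [{"drone_model": "model_c", "message": f"message {i}"} for i in range(24)]
)

train, test = split_by_drone_model(records, test_models={"model_c"})

# The ratio follows from how many messages each model contributes,
# so it cannot be set to an arbitrary value in advance.
train_share = round(100 * len(train) / len(records))
print(f"train:test = {train_share}:{100 - train_share}")
```

Because whole drone models are assigned to one side, no model contributes messages to both sets, and the final ratio is determined by the per-model message counts.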

Steps to reproduce

The data is constructed from several drone images acquired from VTO Labs Drone Forensics Dataset. The flight logs are collected, and the human-readable messages within every flight log file are parsed. Six entity types are then identified by carefully reading all the unique messages. Two annotation procedures, namely consistent and contextual tagging, are defined and used to annotate the data. The result is two datasets that are ready to use for building an NER model that recognizes entities mentioned in drone flight log files.
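The difference between the two annotation procedures can be illustrated with a toy tagger. This is one possible reading of the description, not the authors' annotation tool: the lexicon entries, the example message, and the use of BIO labels are assumptions made for illustration only. Contextual tagging is modelled as preferring the longest matching span, and consistent tagging as preferring the shortest.

```python
# Hypothetical lexicon mapping token spans to entity types.
# The entries are illustrative and are not taken from the dataset.
LEXICON = {
    ("compass",): "component",
    ("error",): "issue",
    ("compass", "error"): "issue",
}


def tag(tokens, lexicon, prefer_longest):
    """Label tokens with BIO tags, choosing the longest or shortest matching span."""
    tags = []
    i = 0
    while i < len(tokens):
        # Lengths of all lexicon spans that start at position i.
        lengths = [
            len(span) for span in lexicon
            if tuple(tokens[i:i + len(span)]) == span
        ]
        if not lengths:
            tags.append("O")
            i += 1
            continue
        n = max(lengths) if prefer_longest else min(lengths)
        label = lexicon[tuple(tokens[i:i + n])]
        tags.extend([f"B-{label}"] + [f"I-{label}"] * (n - 1))
        i += n
    return tags


tokens = ["compass", "error", "detected"]  # hypothetical log message
contextual = tag(tokens, LEXICON, prefer_longest=True)
consistent = tag(tokens, LEXICON, prefer_longest=False)
print(contextual)
print(consistent)
```

Under these assumptions, the contextual variant labels "compass error" as a single issue span, while the consistent variant labels "compass" as a component and "error" as an issue on their own.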

Institutions

Institut Teknologi Sepuluh Nopember

Funding

PMDSU Scholarship from The Ministry of Education, Culture, Research and Technology, The Republic of Indonesia

1483/PKS/ITS/2022
