Tropical forest gunshot classification training audio dataset

Published: 9 March 2022 | Version 3 | DOI: 10.17632/x48cwz364j.3
Lydia Katsis


DATA SOURCE LOCATION
Data were collected at tropical forest sites in central Belize. Recordings were made in Tapir Mountain Nature Reserve (TMNR) and the adjoining Pook’s Hill Reserve in Cayo District, Belize [17.150, -88.860], and in Manatee Forest Reserve (MFR) and surrounding protected areas in Belize District, Belize [17.260, -88.490].

FOLDERS
The folders contain audio files recorded between 2017 and 2021. The ‘Training data’ folder and the ‘Validation data’ folder contain two temporally distinct datasets, which can be used for model training and validation. The training folder contains 80% of the total dataset, and the validation folder contains the remaining 20%. Each of these folders contains two subfolders labelled ‘Gunshot’ and ‘Background’.

FILES
The folders contain 749 gunshot files and over 35,000 background files. The files are in Waveform Audio File Format (wav) and are each 4.09 seconds long. The first 8 alphanumeric characters of each file name correspond to the hexadecimal UNIX timestamp of the time of recording. Some file names contain additional alphanumeric characters after these initial 8 characters; these were used as unique identifiers during processing and do not convey any additional information.
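The file naming convention described under FILES can be decoded with a few lines of Python. This is a minimal sketch, not code supplied with the dataset: the function name `recording_time` and the example file name `5E0BE100_1.wav` are hypothetical, and it assumes the timestamp counts seconds since the UNIX epoch in UTC, which the description does not state explicitly.

```python
from datetime import datetime, timezone
from pathlib import Path


def recording_time(file_name: str) -> datetime:
    """Decode the first 8 hexadecimal characters of a file name as a UNIX timestamp.

    Assumption: the timestamp is in seconds and is interpreted as UTC.
    Any characters after the first 8 are ignored, since the description says
    they carry no additional information.
    """
    stem = Path(file_name).stem          # drop the directory and the .wav extension
    seconds = int(stem[:8], 16)          # raises ValueError if the prefix is not hexadecimal
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Hypothetical file name, used only for illustration.
print(recording_time("5E0BE100_1.wav").isoformat())  # 2020-01-01T00:00:00+00:00
```

If the recorders' clocks were set to local time rather than UTC, the decoded values would need to be shifted accordingly.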


Steps to reproduce

Data were collected using AudioMoth acoustic sensors (v 1.0.0 and v 1.1.1). AudioMoths were housed in either a rubber-sealed watertight plumbing tube or the official AudioMoth IPX7 waterproof case. In both cases a Schlegel acoustic vent was used. At 20 separate locations we attached AudioMoths to the nearest suitable tree at shoulder height and directed the sensor towards valley areas. Devices were left at their respective sites for 1-12 months. AudioMoths were configured with custom firmware developed to detect and record gunshots in Belizean tropical forests (see Prince et al. 2019). Sounds were recorded as 4.09 s Waveform Audio File Format (wav) files at a sample rate of 8 kHz.

Additional recordings were created through controlled firing of shotguns in Pook’s Hill Reserve. Shotguns were fired during the day and night at distances of 0-1000 m from 13 continuously recording sensors. Gunshot events were manually extracted from these controlled-firing recordings by listening and inspecting spectrograms.

Background files were obtained by taking a random sample of 10,000 files from the two surveys (TMNR and MFR) and then listening to them to remove any files containing gunshots or human voices. Gunshot files were extracted from these survey data using a custom gunshot detection filter, followed by spectrogram inspection and listening to separate gunshots from false positives. False positives from this filter were also added to the background dataset.

Training and validation of automated classification models require separate, non-overlapping training and validation datasets. The data used to train and validate the classification model in Katsis et al. (in press) comprised the ground-truthed data as well as the survey data collected in TMNR and MFR. For each of these datasets, we ordered the recordings in each category (background and gunshot) by the time the recording was made. Recordings made in the first 80% of the sampling period for each category at each site formed the training dataset, and the remaining recordings, from the last 20% of the sampling period, formed the validation dataset. We partitioned the data in this way to reduce ‘leakage’ between the validation and training datasets, which would result in the model being validated on data it has already seen. See Katsis et al. (in press) for a detailed description of the methodology.
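The temporal partitioning described above can be sketched in Python as follows. This is an illustrative reading of the description, not the authors' code: the function `temporal_split`, the record layout `(site, category, unix_seconds, name)` and the example values are all hypothetical. It assumes the cutoff is placed at 80% of the elapsed time between the first and last recording in each site and category, with recordings exactly at the cutoff assigned to training.

```python
from collections import defaultdict


def temporal_split(records, train_fraction=0.8):
    """Split recordings into training and validation sets by recording time.

    records: iterable of (site, category, unix_seconds, name) tuples.
    Each (site, category) group is split separately: recordings made in the
    first `train_fraction` of that group's sampling period go to training,
    and later recordings go to validation.
    """
    groups = defaultdict(list)
    for site, category, unix_seconds, name in records:
        groups[(site, category)].append((unix_seconds, name))

    train, validation = [], []
    for items in groups.values():
        items.sort()                                  # order by recording time
        start, end = items[0][0], items[-1][0]        # sampling period of this group
        cutoff = start + train_fraction * (end - start)
        for unix_seconds, name in items:
            (train if unix_seconds <= cutoff else validation).append(name)
    return train, validation


# Hypothetical example: 11 gunshot recordings at one site, at times 0..10.
example = [("TMNR", "Gunshot", t, f"g{t}") for t in range(11)]
train_names, validation_names = temporal_split(example)
print(len(train_names), len(validation_names))  # 9 2
```

Because the split is made on elapsed time rather than on file counts, the proportion of files in each set depends on how evenly the recordings are spread over the sampling period.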


University of Southampton, University of Oxford


Audio Recording, Acoustic Monitoring, Tropical Forest Ecosystem