HornBase - A Car Horns Dataset
Description
To provide data for machine learning models that classify car horns in a traffic environment, we created HornBase, a dataset of 1080 audio files, each exactly one second long, manually labeled and balanced across two classes: horn and not horn. The audio excerpts were collected in ten situational scenarios in which a horn can reach the receiving car, where the audio is analyzed. They cover three horn styles: a short honk, a long honk, and an intermittent sequence of three consecutive short honks. For each candidate audio segment, three temporal windows are cut: the first contains the initial half of a horn, the second the entire horn, and the third the final half of a horn.
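The properties stated above (one-second clips, two balanced classes) can be checked after download with a short script. The following is a minimal sketch, not part of the dataset itself. It assumes the clips are PCM WAV files arranged as `<root>/<class>/<clip>.wav`; that layout, the folder names, and the function names are hypothetical and should be adapted to the actual archive structure.

```python
import wave
from collections import Counter
from pathlib import Path


def clip_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def summarize_dataset(root, expected_seconds=1.0, tolerance=1e-3):
    """Count clips per class folder and collect clips whose duration is off.

    Assumed (hypothetical) layout: <root>/<class>/<clip>.wav
    """
    counts = Counter()
    off_length = []
    for path in sorted(Path(root).glob("*/*.wav")):
        label = path.parent.name  # class name taken from the folder name
        counts[label] += 1
        if abs(clip_duration_seconds(path) - expected_seconds) > tolerance:
            off_length.append(path)
    return counts, off_length
```

If the two classes are exactly balanced, `counts` should report 540 clips per class (1080 in total), and `off_length` should be empty.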
Steps to reproduce
The data were collected in a controlled traffic situation involving two vehicles: one sounding the horn and the other receiving it. In the receiving vehicle, two smartphones recorded the audio while a route was driven for each defined scenario. The recordings were then pre-processed by manually cutting audio segments with and without horns, each in a one-second time window.
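The cutting itself was done manually, so the following is only an illustrative sketch of how the three one-second windows (initial half, entire horn, final half) could be derived from an annotated horn onset and offset. The rule for placing the windows around the horn midpoint is an assumption made for this example, not the authors' documented procedure, and the function names are hypothetical.

```python
def horn_windows(horn_start, horn_end, window=1.0):
    """Return three (start, end) windows in seconds around one horn event.

    Assumed placement rule (not taken from the dataset description):
    - initial_half: window ends at the horn midpoint
    - entire:       window is centered on the horn midpoint
    - final_half:   window starts at the horn midpoint
    """
    mid = (horn_start + horn_end) / 2.0
    return {
        "initial_half": (mid - window, mid),
        "entire": (mid - window / 2.0, mid + window / 2.0),
        "final_half": (mid, mid + window),
    }


def to_sample_range(start_s, end_s, sample_rate):
    """Convert a window in seconds to (first_sample, last_sample_exclusive)."""
    start = max(0, round(start_s * sample_rate))  # clamp to recording start
    length = round((end_s - start_s) * sample_rate)
    return start, start + length
```

Under this rule, the "entire" window holds the whole horn only when the horn is at most one second long; a longer honk would need a different placement rule.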