Click and Pulse Recognition for Marine Species Classification
Description
This dataset supports the manuscript "Methodology for Click and Short Duration Pulses Recognition as a Dual Factor in the Classification of Marine Species." The research hypothesis is that impulsive acoustic clicks in continuous passive acoustic monitoring (PAM) recordings can be automatically detected and classified as biological or non-biological/anthropogenic, using unsupervised anomaly detection (Isolation Forest) combined with periodicity and spectral descriptors (inter-click interval (ICI) coefficient of variation, autocorrelation, bandwidth, and duration).

The dataset provides the raw and processed audio recordings, together with the full quantitative acoustic analysis, underlying the manuscript's results across seven acoustic sources: snapping shrimp (Alpheidae), common bottlenose dolphin (Tursiops truncatus), killer whale (Orcinus orca), Gobius cobitis, stingray (Urogymnus granulatus), rig shark (Mustelus lenticulatus), and vessel (propeller) noise. Recordings were obtained with several hydrophone systems at the field sites and laboratory facilities described in the main article, and correspond to segments with confirmed species activity as identified in the original source publications.

Contents
- main.pdf: a PDF copy of the submitted preprint.
- supp_material.pdf: complete acoustic analysis per recording, including multidimensional plots of detected events (temporal distribution by energy, ICI histograms, duration distributions, per-click frequency ranges), statistical distributions (violin/box plots) of duration, RMS amplitude, bandwidth, and ICI, cumulative time-series analysis (event count, rolling energy, spectral centroid), and detection-performance metrics (confusion matrix, ROC) for recordings not covered as case studies in the main article.
- audios/original/: raw, unprocessed WAV recordings as acquired in the field or laboratory.
- audios/filtered/: the same segments after Isolation Forest-based click/pulse isolation, retaining only the impulsive events identified as anomalies relative to background noise.

Key findings
The detector achieves high sensitivity for acoustically well-defined sources. Performance degrades when a biological source temporally overlaps with anthropogenic transients of similar spectral profile, as seen in the orca case (false positive rate, FPR = 0.850). The periodicity-based classification scheme reliably separates biological echolocation, stereotyped biological, and anthropogenic sources at the extremes of the parameter space. Some taxa (snapping shrimp, stingrays, Mustelus lenticulatus) remain acoustically ambiguous, which reflects the current scarcity of reference bioacoustic data rather than a failure of the method.

Interpretation and use
Amplitude and energy values are dimensionless, computed directly from raw WAV samples without absolute sensitivity calibration. They support within-recording and same-equipment comparisons but are not comparable to sound pressure levels across different hydrophones or gain settings.
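The gain dependence of the dimensionless values can be illustrated with a minimal sketch. It assumes 16-bit PCM samples scaled to full scale and a mean-subtraction step for DC-offset removal. The function name rms_and_energy and the sample values are hypothetical and are not taken from the authors' scripts.

```python
import numpy as np

def rms_and_energy(samples):
    """Dimensionless RMS amplitude and energy of a 1-D sample array."""
    x = np.asarray(samples, dtype=np.float64)
    x = x - x.mean()  # DC-offset removal (assumed to be a simple mean subtraction)
    return np.sqrt(np.mean(x ** 2)), np.sum(x ** 2)

# Hypothetical 16-bit PCM samples, scaled to full scale [-1, 1)
pcm = np.array([0, 16384, 0, -16384], dtype=np.int16)
x = pcm / 32768.0
rms, energy = rms_and_energy(x)

# The same event recorded with twice the gain: RMS doubles, energy quadruples,
# although the underlying sound pressure is unchanged.
rms_2x, energy_2x = rms_and_energy(2.0 * x)
```

Without a calibrated hydrophone sensitivity and known gain, there is no way to map either pair of values back to a sound pressure level, which is why cross-equipment comparisons are not supported.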
Steps to reproduce
Audio recordings come from five PAM case studies (snapping shrimp, dolphins/orcas, Gobius cobitis, stingrays, and Mustelus lenticulatus), collected with different hydrophones (digitalHyd SR-1, SoundTrap 300HF/ST300HF, HydroMoth) at field sites and one laboratory setting. Only segments with confirmed species activity in the original source publications were used.

Each recording was preprocessed (normalization, DC-offset removal, windowing) and analyzed using FFT and STFT spectrograms. Clicks and pulses were then automatically detected with an unsupervised Isolation Forest algorithm, trained on background noise segments to flag anomalous impulsive events without labeled data. For each detected event, descriptors were extracted (duration, RMS amplitude, energy, frequency range, bandwidth, inter-click interval).

Events were classified as biological echolocation, biological stereotyped, or anthropogenic using a rule based on ICI variability, autocorrelation, and bandwidth, with confidence intervals from bootstrap resampling. Detector performance was assessed per case study using confusion matrices and ROC operating points.

The full pipeline was implemented in Python (NumPy, SciPy, scikit-learn) with custom scripts; the supplementary PDF and all figures were generated directly from this analysis. The original and filtered audio folders provide the exact inputs and outputs of the pipeline, allowing others to reproduce the detection and reuse the method on new recordings.
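For readers who want to experiment before the custom scripts are available, the detection step can be approximated with the sketch below. It is not the authors' implementation. The frame length, hop size, the two per-frame features (log-energy and peak amplitude), the synthetic click train, and all function names are assumptions chosen for illustration; only the use of scikit-learn's IsolationForest fitted on background noise follows the description above.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def frame_features(x, frame_len=256, hop=128):
    """Log-energy and peak amplitude per frame (hypothetical feature choice)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = hop * np.arange(n_frames)[:, None] + np.arange(frame_len)[None, :]
    frames = x[idx]
    log_energy = np.log10(np.sum(frames ** 2, axis=1) + 1e-12)
    peak = np.max(np.abs(frames), axis=1)
    return np.column_stack([log_energy, peak])

def flag_impulsive_frames(x, noise, frame_len=256, hop=128, seed=0):
    """Fit on background noise only, then flag frames of x scored as anomalies."""
    model = IsolationForest(n_estimators=100, random_state=seed)
    model.fit(frame_features(noise, frame_len, hop))
    return model.predict(frame_features(x, frame_len, hop)) == -1

def event_onsets(flags, hop, fs):
    """Merge runs of consecutive flagged frames; return onset times in seconds."""
    starts = flags & ~np.concatenate(([False], flags[:-1]))
    return np.flatnonzero(starts) * hop / fs

def ici_cv(onset_times):
    """Coefficient of variation of the inter-click intervals."""
    ici = np.diff(onset_times)
    return np.std(ici) / np.mean(ici)

# Synthetic demo (not dataset audio): Gaussian noise plus one click every 50 ms
fs = 48_000
rng = np.random.default_rng(0)
noise = 0.01 * rng.standard_normal(fs)       # background-only training segment
signal = 0.01 * rng.standard_normal(fs)      # segment to analyze
click_starts = np.arange(2_400, fs - 2_400, 2_400)
for s in click_starts:
    signal[s:s + 32] += 1.0                  # crude 32-sample impulsive event

flags = flag_impulsive_frames(signal, noise)
onsets = event_onsets(flags, hop=128, fs=fs)
```

With the default decision threshold, some pure-noise frames are typically flagged as well, so the anomaly threshold (for example through the contamination parameter or score_samples) would need tuning on real recordings before ici_cv(onsets) is meaningful. On an ideal, perfectly regular click train the ICI coefficient of variation is close to zero; larger values indicate irregular timing.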
Institutions
- Universidad de Oviedo, Oviedo, Asturias
- Woods Hole Oceanographic Institution, Falmouth, Massachusetts