Bat Echolocation Call Analysis with Deep Learning Models
This code project contains scripts commonly required for analysing bat echolocation call data.

Basic scripts for data preparation are provided for:
- cleaning audio data
- processing audio data into linear spectrograms or Mel-Frequency Cepstral Coefficients (MFCCs)
- creating unique IDs for each pair of image and corresponding label

The optimization core contains the following functionalities:
- basic but highly customizable training and validation functions for neural network models
- performance metrics for evaluating models on binary and multi-class tasks
- logger classes to track various values and parameters

The analysis part is divided into an unsupervised learning approach and a supervised learning approach. First, unsupervised methods such as autoencoders, combined with dimensionality-reduction and clustering algorithms like UMAP, are used to analyse the datasets for general geometric properties. The convolutional autoencoder learns an efficient representation of the data and feeds the UMAP algorithm with highly compressed latent feature vectors for faster convergence. The clustering can then reveal similarities between specific classes and guide the subsequent supervised analysis, for example by adjusting the composition of classes and dataset sources before training a supervised neural network model. Dataset sources encompass metadata such as the year/time of measurement and the location/position of the measurement devices.

The dataset used is confidential, and this code project is hand-tailored to it. Nevertheless, all concepts used in this project are applicable to most animal call datasets that provide labeled audio data. Thus, this code project can serve as a helpful template for future research in animal acoustic analysis, and specifically for bat echolocation call data.
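The spectrogram/MFCC preprocessing step can be sketched as follows. This is a minimal illustration using `scipy` on a synthetic tone, not the project's actual preprocessing code; all function names and parameter values here are assumptions for illustration.

```python
import numpy as np
from scipy.signal import spectrogram
from scipy.fftpack import dct

def linear_spectrogram(audio, sr, nperseg=512, noverlap=256):
    """Compute a linear power spectrogram (frequency bins x time frames)."""
    freqs, times, sxx = spectrogram(audio, fs=sr, nperseg=nperseg, noverlap=noverlap)
    return freqs, times, sxx

def simple_mfcc(audio, sr, n_mfcc=13, n_filters=26, nperseg=512, noverlap=256):
    """Rough MFCC sketch: power spectrogram -> mel filter bank -> log -> DCT."""
    freqs, _, sxx = spectrogram(audio, fs=sr, nperseg=nperseg, noverlap=noverlap)

    # Triangular mel filter bank over the available frequency range.
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    mel_points = np.linspace(hz_to_mel(freqs[0]), hz_to_mel(freqs[-1]), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    fbank = np.zeros((n_filters, len(freqs)))
    for i in range(n_filters):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (freqs - left) / (center - left)
        falling = (right - freqs) / (right - center)
        fbank[i] = np.maximum(0.0, np.minimum(rising, falling))

    mel_energy = np.log(fbank @ sxx + 1e-10)   # (n_filters, frames)
    return dct(mel_energy, type=2, axis=0, norm="ortho")[:n_mfcc]

# Synthetic stand-in for a bat call: a 40 kHz tone sampled at 250 kHz.
sr = 250_000
t = np.arange(0, 0.02, 1.0 / sr)
audio = np.sin(2 * np.pi * 40_000 * t)
freqs, times, sxx = linear_spectrogram(audio, sr)
mfcc = simple_mfcc(audio, sr)
```

In practice, libraries such as `librosa` provide production-grade MFCC implementations; the sketch above only shows the underlying pipeline.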
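Creating unique IDs for image/label pairs can be done by hashing the identifying metadata, so the same pair always maps to the same ID. A minimal sketch (the path and label below are hypothetical examples, not from the confidential dataset):

```python
import hashlib

def pair_id(audio_path: str, label: str) -> str:
    """Derive a deterministic, collision-resistant ID for an (image, label) pair."""
    digest = hashlib.sha256(f"{audio_path}|{label}".encode("utf-8")).hexdigest()
    return digest[:16]  # shortened for use in file names

# The same inputs always yield the same ID, so spectrogram images and
# label files can be matched reliably by name.
id_a = pair_id("recordings/site3_2019/call_0042.wav", "Myotis myotis")
id_b = pair_id("recordings/site3_2019/call_0042.wav", "Myotis myotis")
```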
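The general shape of a customizable training/validation pair of functions can be illustrated with a toy model. This NumPy logistic-regression sketch is only a stand-in for the project's neural-network training code; the function names, learning rate, and toy data are assumptions.

```python
import numpy as np

def train(weights, X, y, lr=0.1, epochs=200):
    """Minimal gradient-descent training loop for a logistic-regression 'model'."""
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ weights)))   # sigmoid activation
        grad = X.T @ (p - y) / len(y)              # binary cross-entropy gradient
        weights -= lr * grad
    return weights

def validate(weights, X, y):
    """Return accuracy of thresholded predictions on a held-out set."""
    p = 1.0 / (1.0 + np.exp(-(X @ weights)))
    return float(np.mean((p > 0.5) == y))

rng = np.random.default_rng(0)
# Toy separable data: the class is determined by the sign of the first feature.
X = rng.normal(size=(200, 2))
y = (X[:, 0] > 0).astype(float)
w = train(np.zeros(2), X[:150], y[:150])
acc = validate(w, X[150:], y[150:])
```

The same train/validate split of responsibilities carries over to deep-learning frameworks, where the loop body becomes a forward pass, loss computation, and optimizer step.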
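The binary and multi-class performance metrics can both be derived from a confusion matrix. A minimal sketch (function names and the toy labels are illustrative assumptions):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def per_class_metrics(cm):
    """Per-class precision and recall; works for binary (2x2) and multi-class."""
    tp = np.diag(cm).astype(float)
    precision = tp / np.maximum(cm.sum(axis=0), 1)  # column sums = predicted counts
    recall = tp / np.maximum(cm.sum(axis=1), 1)     # row sums = true counts
    return precision, recall

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, 3)
prec, rec = per_class_metrics(cm)
accuracy = np.diag(cm).sum() / cm.sum()
```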
Steps to reproduce
A README file is provided; see "README.md" in the project folder for more information.