H-Voice: Fake voice histograms (Imitation+DeepVoice)

Published: 31 January 2020 | Version 4 | DOI: 10.17632/k47yd3m28w.4
Contributors:
Dora Maria Ballesteros L

Description

This data set consists of 6672 histograms of original voice recordings and fake voice recordings obtained by Imitation [1, 2] and Deep Voice [3]. The histograms provided in this dataset can be used to train a machine learning system to classify original and fake voice recordings obtained with the Imitation and Deep Voice algorithms.

Note: corrupted images have been fixed.

Each directory has the following composition:

Training_fake: 2088 histograms of fake voice recordings (2016 with Imitation and 72 with Deep Voice)
Training_original: 2020 histograms of original voice recordings
Validation_fake: 864 histograms of fake voice recordings (all with Imitation)
Validation_original: 864 histograms of original voice recordings
External_test1: 760 histograms (380 original + 380 fake with Imitation)
External_test2: 76 histograms (4 original + 72 fake with Deep Voice)

References:

[1] DM Ballesteros L, JM Moreno A. Highly transparent steganography model of speech signals using Efficient Wavelet Masking. Expert Systems with Applications 39 (10), 2012, 9141-9149, https://doi.org/10.1016/j.eswa.2012.02.066
[2] DM Ballesteros L, JM Moreno A. On the ability of adaptation of speech signals and data hiding. Expert Systems with Applications 39 (16), 2012, 12574-12579, https://doi.org/10.1016/j.eswa.2012.05.027
[3] S.O. Arik, M. Chrzanowski, A. Coates, G. Diamos, A. Gibiansky, Y. Kang, X. Li, J. Miller, A. Ng, J. Raiman, S. Sengupta, M. Shoeybi. Deep Voice: Real-time Neural Text-to-Speech. 2017. https://arxiv.org/abs/1702.07825
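
The directory counts listed in this description can be checked against a downloaded copy. The following is a minimal sketch, not part of the dataset itself: it assumes the six directories sit directly under one root folder and that the histograms are stored as image files with common extensions (the extension set and the function names `count_histograms` and `find_mismatches` are hypothetical choices made for illustration).

```python
from pathlib import Path

# Counts as stated in the dataset description (they sum to 6672).
EXPECTED_COUNTS = {
    "Training_fake": 2088,
    "Training_original": 2020,
    "Validation_fake": 864,
    "Validation_original": 864,
    "External_test1": 760,
    "External_test2": 76,
}

# Assumption: histograms are image files with one of these extensions.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp"}


def count_histograms(root):
    """Return {directory name: number of image files found under it}."""
    root = Path(root)
    counts = {}
    for name in EXPECTED_COUNTS:
        folder = root / name
        if not folder.is_dir():
            # A missing directory is reported as zero files.
            counts[name] = 0
            continue
        counts[name] = sum(
            1
            for p in folder.rglob("*")
            if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


def find_mismatches(counts):
    """Return {name: (found, expected)} for directories whose count differs."""
    return {
        name: (counts.get(name, 0), expected)
        for name, expected in EXPECTED_COUNTS.items()
        if counts.get(name, 0) != expected
    }


if __name__ == "__main__":
    import sys

    dataset_root = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, (found, expected) in find_mismatches(count_histograms(dataset_root)).items():
        print(f"{name}: found {found}, expected {expected}")
```

Running the script with the dataset root as its only argument prints one line per directory whose file count differs from the description, and nothing if all six match.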

Files

Steps to reproduce

The histograms provided in this dataset can be used to train a machine learning system to classify original and fake voice recordings obtained with the Imitation and Deep Voice algorithms. A detailed description of this dataset has been submitted to the journal Data in Brief, with the title "A dataset of histograms of original and fake voice recordings (H-Voice)".
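
As a possible starting point for training such a classifier, the sketch below builds a list of (file path, label) pairs from the training and validation directories. It is an illustration under stated assumptions, not the authors' procedure: the label convention (0 = original, 1 = fake), the image extension set, and the function name `list_labeled_histograms` are all hypothetical. External_test1 and External_test2 are left out because each mixes original and fake histograms in a single directory, and this page does not say how the two classes are distinguished there.

```python
from pathlib import Path

# Assumed label convention for this sketch: 0 = original, 1 = fake.
SPLIT_DIRS = {
    "train": {"Training_original": 0, "Training_fake": 1},
    "validation": {"Validation_original": 0, "Validation_fake": 1},
}

# Assumption: histograms are image files with one of these extensions.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp"}


def list_labeled_histograms(root, split):
    """Return a list of (path, label) pairs for the given split.

    `split` must be "train" or "validation". Directories that are
    missing under `root` are skipped silently.
    """
    root = Path(root)
    samples = []
    for dirname, label in SPLIT_DIRS[split].items():
        folder = root / dirname
        if not folder.is_dir():
            continue
        # Sort so the resulting order is reproducible across runs.
        for path in sorted(folder.rglob("*")):
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((path, label))
    return samples
```

The resulting list can be passed to whichever image-loading and training library is in use; calling `list_labeled_histograms(root, "train")` on a complete copy should yield 4108 pairs (2020 original and 2088 fake) if the directory counts match the description.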

Institutions

Universidad Militar Nueva Granada

Categories

Computer Vision, Speech Processing, Machine Learning

Licence