This is the first FPT Open Speech Data (FOSD) and Tacotron-2-based Text-to-Speech Model Dataset for Vietnamese.
It comprises:
- A configuration file in *.json format;
- Training and validation text input files (in *.csv format);
- A trained model (checkpoint file, after 225,000 steps);
- Sample audio generated by the trained model.
This dataset is useful for research related to TTS and its applications, text processing and especially TTS output optimization given a set of predefined input texts.
A generative statistical framework for musical note structures plays a crucial role in multimedia music classification and reconstruction strategies. Equally significant for harmonious music composition are the rhythmic structures that give a musical performance its harmonic form. This paper illustrates computational music theory and allied factors that govern how human beings acquire, remember, and reconstruct music, with a view to sustaining intangible cultural and entertainment heritage. Music strings or symbols are also important, since they serve musicians as performance guidelines. To provide a syntactic outline of musical note arrangements, the paper presents a stochastic model together with a probabilistic context-free music grammar. State transition analysis is also incorporated, in the form of a transition table and diagram, to show which state can move to which other state within a finite automaton, depending on the current state and the associated transition rule. A Petri net is used to model and simulate the proposed music composition framework and to analyze system performance. Petri net simulation-based reachability and system efficiency are evaluated to assess the effectiveness of the proposed event-driven architecture. A music composition and reconstruction tool is also demonstrated for incorporating real data into the proposed framework. The system performance evaluation metric shows that the proposed music composition model achieves an efficiency level of around 92%.
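As a rough illustration of the transition-table idea mentioned in the abstract above, the following minimal sketch runs a finite automaton over a sequence of symbols. The state names (`start`, `phrase`, `pause`, `accept`) and input symbols (`note`, `rest`, `end`) are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
# Hypothetical transition table: (current state, input symbol) -> next state.
# The states and symbols are illustrative assumptions, not the paper's model.
TRANSITIONS = {
    ("start", "note"): "phrase",
    ("phrase", "note"): "phrase",
    ("phrase", "rest"): "pause",
    ("pause", "note"): "phrase",
    ("phrase", "end"): "accept",
    ("pause", "end"): "accept",
}


def run_automaton(symbols, start="start", accepting=("accept",)):
    """Return True if the symbol sequence ends in an accepting state."""
    state = start
    for symbol in symbols:
        key = (state, symbol)
        if key not in TRANSITIONS:
            return False  # no transition rule for this state/symbol pair
        state = TRANSITIONS[key]
    return state in accepting
```

For example, `run_automaton(["note", "note", "rest", "note", "end"])` follows the rules above to the accepting state, whereas a sequence that begins with `"rest"` is rejected because the table has no rule for it.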
This dataset consists of 348 non-zero-onset Vietnamese speeches (with their transcripts and the labelled start and end times of each speech) extracted from approximately 30 hours of FPT Open Speech Data (released publicly in 2018 by FPT Corporation). The extraction was performed automatically by a Python program written by the contributor.
The speeches are in *.mp3 and *.wav formats (mono, 48 kHz, 32-bit float), while the transcript file is in *.txt format with UTF-8 encoding.
The dataset is useful for any onset detection research and development since the start and end times of each speech are already labelled.
Copyright 2018 FPT Corporation
Permission is hereby granted, free of charge, non-exclusive, worldwide, irrevocable, to any person obtaining a copy of this data or software and associated documentation files (the “Data or Software”), to deal in the Data or Software without restriction, including without limitation the rights to use, copy, modify, remix, transform, merge, build upon, publish, distribute and redistribute, sublicense, and/or sell copies of the Data or Software, for any purpose, even commercially, and to permit persons to whom the Data or Software is furnished to do so, subject to the following conditions:
The above copyright notice, and this permission notice, and indication of any modification to the Data or Software, shall be included in all copies or substantial portions of the Data or Software.
THE DATA OR SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE DATA OR SOFTWARE OR THE USE OR OTHER DEALINGS IN THE DATA OR SOFTWARE.
Patent and trademark rights are not licensed under this FPT Public License.
This data includes all the input data for the test instances used in the experiments. The input data consists of three sets of benchmarks: the OR-Lib set (40 graphs), the TSP-Lib set (20 graphs) and the University of Florida Sparse Matrix Collection (3 graphs).
This dataset is the complete directory of all Trygve's web pages. The HTML code for each web page can be located from its URL.
For example, the HTML for
is in the file at
The University of Oslo is terminating its Web service after 25 years of operation. My gigabyte of web pages has been collected over the years and will no longer be accessible over the Net. The pages are stored in this dataset, and it may be possible to transfer them to another service if required. It should in any case be possible to read the dataset with an HTML reader.
This folder contains BatClassify results files used to create Temporal Pass Plots in the accompanying article Fig. 2 (Richmond_Myotis_BatClassify_Results.csv), Fig. 3a (Richmond_Ppyg_Site_A_BatClassify_Results.csv) and Fig. 3b (Richmond_Ppyg_Site_B_BatClassify_Results.csv). All other files are used in the TPP vignette, which is provided in the supplementary material.
Contributors: Beason Richard, Riesch Rüdiger, Koricheva Julia
Temporal activity patterns can potentially reveal useful information about behaviour, phenological changes and emergence times for bat species; however, detailed assessments of temporal activity are infrequently performed or published for bats. Passive electronic devices, such as autonomous recording units and camera traps, are increasingly being used as a means of monitoring various species, communities and habitats. Data recorded by these devices inherently contain file metadata detailing the dates and times when data capture took place. We have utilised this metadata to create the Temporal Pass Plot (TPP), which provides intuitive, yet highly detailed, visualisations of temporal bat activity over prolonged periods of time. Furthermore, TPPs are produced using a common scale based upon activity within predetermined time-blocks, enabling direct comparisons between different sites and species to be performed.
TPPs reveal inter- and intra-specific differences, and seasonal changes, in temporal activity. As a relatively untapped area of research, further study is required to evaluate associations between activity patterns and different behaviours (e.g. roosting, commuting and swarming). However, if this can be achieved, the scope of assessments that could be performed with passive monitoring technologies could be significantly expanded, enabling more detailed evaluations of habitat use to be performed with minimal disturbance to the target species. Although the TPP was principally designed for the purpose of studying bat activity, it can easily be adapted for other species that are monitored using autonomous recording devices.
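The time-block counting that underlies a plot of this kind can be sketched as follows. This is a minimal sketch, not the authors' implementation: it assumes that each recording's timestamp has already been read from the file metadata into a `datetime` object, and the function name `count_passes_per_block` and the 10-minute default block length are hypothetical.

```python
from collections import Counter
from datetime import datetime


def count_passes_per_block(timestamps, block_minutes=10):
    """Count recordings per (date, time-block) from their timestamps.

    Each key is (calendar date, block start in minutes since midnight).
    The block length is an assumed parameter, not a value from the article.
    """
    counts = Counter()
    for ts in timestamps:
        minutes_since_midnight = ts.hour * 60 + ts.minute
        # Round down to the start of the block containing this timestamp.
        block_start = (minutes_since_midnight // block_minutes) * block_minutes
        counts[(ts.date(), block_start)] += 1
    return counts
```

Because every site and species is binned on the same fixed blocks, the resulting counts can be compared directly, which is the property the abstract attributes to the common scale.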
Article data: This folder contains the data files used to create all three Temporal Pass Plots shown in the main article.
Vignette data: This folder contains all files described in the TPP vignette, which is provided in the supplementary materials of the main article.
Origin graph files of the manuscript LiFePO4_S cathode proof.
Figure 1 Cyclic Voltammetry of LiFePO4-S composite with LiPF6 electrolyte.
Figure 2 Cyclic Voltammetry of LiFePO4-S composite with LiTFSI electrolyte.
Figure 3 dQ/dV curves calculated from charge/discharge cycling data of the LiFePO4-S composite cathode with LiTFSI electrolyte: Lithiation in cycle 3 (a) and cycle 5 (b), and delithiation in cycle 3 (c) and cycle 5 (d).
Figure 5 XRD pattern of hydrothermal carbon-LiFePO4 composite prepared in acetic acid. The pattern of LiFePO4 without carbon coating is shown as a reference. * Peak corresponding to graphitic carbon.
Figure 6 Raman spectrum of hydrothermal carbon-LiFePO4 composite after thermal treatment at 650 °C. The Raman spectrum of LiFePO4 reagent without carbon coating is shown as a reference.
Figure 7 dQ/dV curves calculated from charge/discharge cycling data of composite cathodes of carbon-coated LiFePO4 (prepared in acetic acid) and sulfur infiltrated in porous carbon: Lithiation in cycle 3 (a) and cycle 5 (b), and delithiation in cycle 3 (c) and cycle 5 (d).
Contributors: Zago Mattia, Gil Pérez Manuel, Gregorio Martinez Perez
In computer security, network botnets still represent a major cyber threat. Concealment techniques such as dynamic addressing and Domain Name Generation Algorithms (DGAs) require an improved and more effective detection process. To this end, this data descriptor presents a collection of over 30 million manually labelled, algorithmically generated domain names, accompanied by a feature set that is ready to use for Machine Learning analysis. The proposed data set lets researchers move past the data collection, organization and pre-processing phases, enabling them to focus on the analysis and on producing Machine Learning-powered solutions for network intrusion detection.
To be as exhaustive as possible, 50 of the most important malware variants have been selected. Each family is available both as a list of domains and as a collection of features. More precisely, the former is generated by executing the malware DGAs in a controlled environment with fixed parameters, while the latter is generated by extracting a combination of statistical and Natural Language Processing (NLP) metrics.
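To give a sense of what statistical metrics on a domain name can look like, the sketch below computes three simple ones: length, digit ratio and Shannon entropy of the leftmost label. These particular features and the function name `domain_features` are assumptions made for illustration; the data set's actual feature set is not specified here and may differ.

```python
import math
from collections import Counter


def domain_features(domain):
    """Compute illustrative statistical features for a domain name.

    Only the leftmost label (the text before the first dot) is analysed.
    The chosen features are assumptions, not the data set's feature list.
    """
    label = domain.split(".")[0].lower()
    n = len(label)
    if n == 0:
        return {"length": 0, "digit_ratio": 0.0, "entropy": 0.0}
    counts = Counter(label)
    # Shannon entropy in bits per character of the label.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    digit_ratio = sum(ch.isdigit() for ch in label) / n
    return {"length": n, "digit_ratio": digit_ratio, "entropy": entropy}
```

Algorithmically generated names often have a more uniform character distribution than human-chosen ones, so a feature such as entropy is a common starting point, although no single metric is sufficient on its own.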