Data for: DNA barcode and minibarcode identification of freshwater fishes from Cerrado headwater streams in central Brazil

Published: 24 March 2019 | Version 1 | DOI: 10.17632/9pr3cpf33g.1
Contributor:
Justin Bagley

Description

Data accession in support of the manuscript by Bagley et al. (in revision) on DNA barcoding of headwater stream freshwater fishes of the Brazilian Distrito Federal (DF) and surrounding areas of the Brazilian Shield in central Brazil. This accession contains several folders of files, including data and code that will allow the user to recreate the analyses in the corresponding paper. Directories include the Alignments/, PartitionFinder/, SpeciesIdentifier/, MrBayes/, and R/ subfolders. Contents are as follows:

- The Alignments/ subfolder contains a small set of aligned, edited DNA sequences in different alignment formats (NEXUS, PHYLIP). Alignments with "n149" in the name contain the full 'final' dataset analyzed in our study, with n = 149 DNA barcodes corresponding to specimens listed in Data S1 of the Supporting Information for the paper. The PHYLIP alignment with "n103" in the name contains the reduced "No-Characidium" dataset analyzed for the paper, from which Characidium individuals have been removed.
- The PartitionFinder/ subfolder contains folders with files for conducting two separate runs of our dataset in PartitionFinder v2.1.1 (Lanfear et al. 2014), as described in the paper.
- The SpeciesIdentifier/ subfolder contains input files formatted for SpeciesIdentifier v1.8 (Meier et al. 2006), which were used during the SpeciesIdentifier analyses described in the paper.
- The MrBayes/ subfolder contains a run folder for one run of MrBayes conducted for the paper. A NEXUS file formatted for MrBayes v3.2+ and a batch submission file for queueing the run on a supercomputing cluster are provided. The file named "Mrbayes_sumtp_log.txt" contains the output of the 'sumt' and 'sump' commands in MrBayes, which summarize the posterior trees and parameters of the run.
- The R/ subfolder contains several DNA sequence input files and an R script named "CerradoFish_cox1_n149_DNA_Barcoding_R_Analysis.R" that runs a variety of DNA barcoding-related analyses in R while drawing on the sequence files. The "seqs2" / "fishSpp2" objects are the main objects used when analyzing the full dataset (n = 149 barcodes). The "seqs3" / "fishSpp3" objects are the main objects used when analyzing the No-Characidium dataset (n = 103 barcodes). Running the R script produces many outputs and figures used or modified for the manuscript.

REFERENCES

Bagley, J. C., De Podestà Uchôa de Aquino, P., Breitman, M. F., Langeani, F., & Colli, G. R. (in revision) DNA barcode and minibarcode identification of freshwater fishes from Cerrado headwater streams in central Brazil. Journal of Fish Biology.

Lanfear, R., Calcott, B., Kainer, D., Mayer, C., & Stamatakis, A. (2014) Selecting optimal partitioning schemes for phylogenomic datasets. BMC Evolutionary Biology 14, 82.

Meier, R., Shiyang, K., Vaidya, G., & Ng, P. K. L. (2006) DNA barcoding and taxonomy in Diptera: a tale of high intraspecific variability and low identification success. Systematic Biology 55, 715–728.
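The relationship between the full (n = 149) and No-Characidium (n = 103) alignments described above can be illustrated with a minimal Python sketch. It is not part of the accession and is not the procedure used for the paper. It assumes a relaxed sequential PHYLIP layout (one "label sequence" pair per line) and assumes that each sequence label contains the genus name; the function names and the toy sequences are hypothetical, so check the actual labels in the Alignments/ files before adapting it.

```python
def parse_phylip(text):
    """Parse a relaxed sequential PHYLIP alignment into {label: sequence}.

    Assumes one 'label sequence' pair per line (an assumption about layout).
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_taxa, n_sites = (int(x) for x in lines[0].split()[:2])
    records = {}
    for ln in lines[1:]:
        label, seq = ln.split(None, 1)
        records[label] = seq.replace(" ", "").strip()
    # Check the parsed records against the header counts.
    if len(records) != n_taxa:
        raise ValueError("number of sequences does not match the header")
    if any(len(seq) != n_sites for seq in records.values()):
        raise ValueError("sequence length does not match the header")
    return records


def drop_genus(records, genus):
    """Return the records whose label does not contain the genus name."""
    return {label: seq for label, seq in records.items()
            if genus.lower() not in label.lower()}


def to_phylip(records):
    """Write records back out as relaxed sequential PHYLIP text."""
    n_sites = len(next(iter(records.values()))) if records else 0
    rows = [f"{label} {seq}" for label, seq in records.items()]
    return "\n".join([f"{len(records)} {n_sites}"] + rows) + "\n"


# Toy alignment with made-up labels and sequences (not data from the accession).
toy = (
    "3 8\n"
    "Astyanax_sp1 ACGTACGT\n"
    "Characidium_sp1 ACGTACGA\n"
    "Hypostomus_sp1 ACGAACGT\n"
)
reduced = drop_genus(parse_phylip(toy), "Characidium")
print(to_phylip(reduced))
```

Matching on a label substring is the simplest possible filter; if the labels in the real alignments use specimen codes rather than genus names, a lookup against the specimen list in Data S1 would be needed instead.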

Institutions

Universidade de Brasilia, Virginia Commonwealth University

Categories

Systematics, Freshwater Fish, DNA Sequencing, Mitochondrial DNA, Cytochrome C Oxidase, Genetic Distance, Brazil, Molecular Phylogenetics, DNA Barcoding

Licence