Data for: Genotyping-by-sequencing and ecological niche modeling illuminate phylogeography, admixture, and Pleistocene range dynamics in quaking aspen (Populus tremuloides)
Description of this data
In support of the manuscript by Bagley et al. (in review; see below) on quaking aspen phylogeography and ecological niche modeling (ENM), this accession provides: 1) the in-house laboratory protocol used to extract DNA from aspen leaf tissues (modified from a Strauss Lab protocol); 2) the code used to conduct independent runs of the TASSEL-GBSv2 SNP discovery pipeline (Glaubitz et al. 2014) on our final (combined) genotyping-by-sequencing (GBS) dataset; 3) the resulting SNP variant files from TASSEL-GBSv2 and the final filtered variant call data files used in our genomic analyses; and 4) the unfiltered and filtered species occurrence data files and the code used in our ENM analyses of our focal taxon, Populus tremuloides.
Bagley, J. C., Heming, N. M., Gutiérrez, E. E., Devisetty, U. K., Mock, K. E., Eckert, A. J., & Strauss, S. H. (in review). Genotyping-by-sequencing and ecological niche modeling illuminate phylogeography, admixture, and Pleistocene range dynamics in quaking aspen (Populus tremuloides). Molecular Ecology.
Glaubitz, J. C., Casstevens, T. M., Lu, F., Harriman, J., Elshire, R. J., Sun, Q., & Buckler, E. S. (2014). TASSEL-GBS: a high capacity genotyping by sequencing analysis pipeline. PLoS One, 9(2), e90346.
Steps to reproduce
Molecular laboratory methods:
- Conduct DNA extraction from leaf tissues as described in the enclosed in-house protocol
- Follow other methods for sequencing and dataset construction listed in the Materials and Methods section of the manuscript
Genomic data analyses involved:
- Preparing code and data files, installing requisite software, and setting up the directory structure on local machines (Mac) and a high-performance computing cluster (Linux)
- Running the SNP discovery pipeline TASSEL-GBSv2 on our final dataset (raw sequence files from our GBS experiment and from plates sequenced by Schilling et al. 2014; see text for details)
- Conducting various phylogenomic and population genetic analyses in TreeMix and R
- Plotting results, exploring the data/results, and conducting statistical analyses in R
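As an illustration of the kind of site-level filtering that separates raw SNP variant files from filtered ones, here is a minimal Python sketch that keeps only sites whose call rate (the fraction of samples with a non-missing genotype) meets a threshold. It is not the code in this accession: it assumes the variant files follow a VCF-like tab-delimited layout, and the 0.8 threshold, the function names, and the example records are all hypothetical. The actual filtering criteria are given in the manuscript and the README.

```python
# Hypothetical sketch: filter SNP sites in VCF-style records by call rate.
# The 0.8 threshold and the example records are illustrative assumptions,
# not values taken from the manuscript or from TASSEL-GBSv2.

MISSING = {"./.", ".|.", "."}


def call_rate(genotypes):
    """Return the fraction of samples with a non-missing genotype call."""
    if not genotypes:
        return 0.0
    # The genotype (GT) is the first colon-separated subfield of each sample.
    called = sum(1 for g in genotypes if g.split(":")[0] not in MISSING)
    return called / len(genotypes)


def filter_vcf_lines(lines, min_call_rate=0.8):
    """Keep header lines and data lines whose call rate meets the threshold."""
    kept = []
    for line in lines:
        if line.startswith("#"):
            kept.append(line)  # header and metadata lines pass through
            continue
        fields = line.rstrip("\n").split("\t")
        genotypes = fields[9:]  # sample columns start at the 10th field
        if call_rate(genotypes) >= min_call_rate:
            kept.append(line)
    return kept


# Made-up records for five samples (S1..S5).
example = [
    "##fileformat=VCFv4.0",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\tS4\tS5",
    "1\t100\tsnp1\tA\tG\t.\tPASS\t.\tGT\t0/0\t0/1\t1/1\t0/0\t./.",
    "1\t200\tsnp2\tC\tT\t.\tPASS\t.\tGT\t./.\t./.\t0/1\t0/0\t./.",
]
filtered = filter_vcf_lines(example)
```

In this toy input, snp1 is called in four of five samples and is retained, while snp2 is called in only two of five and is dropped.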
Ecological niche modeling analyses involved:
- Preparing code and installing software on local machines (Mac)
- Preparing the environmental data in R
- Preparing the occurrence data in R
- Preparing minimum convex polygons (MCPs) for the full-species and cluster datasets, and a minimum concave polygon (MCcP) for the full-species occurrence data
- Extracting cluster coordinates from within MCP-based calibration areas (see text, Rscripts, Appendix S1, and Data S2 for details)
- Tuning MaxEnt model parameters (feature classes, FCs, and regularization multiplier, RM) using the ENMevaluate function of ENMeval
- Running final ENMs on the species and cluster datasets, using parameters selected in ENMeval
- Projecting the final ENMs onto different climate scenarios (time-slices)
- Plotting results and calculating metrics describing the models
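The MCP and coordinate-extraction steps above can be sketched generically as follows. This is a minimal Python illustration, not the authors' R workflow (which relies on raster, ENMwizard, and ENMeval): it builds a minimum convex polygon as the convex hull of occurrence points and then keeps only the candidate points that fall inside it. It assumes coordinates can be treated as planar (longitude, latitude) pairs, ignores any buffering or map projection, and uses made-up coordinates and hypothetical function names.

```python
# Hypothetical sketch: minimum convex polygon (convex hull) of occurrence
# points, plus a point-in-polygon test to extract points inside that area.
# Coordinates are treated as planar (lon, lat) pairs, which is an assumption.


def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def minimum_convex_polygon(points):
    """Convex hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def inside_polygon(point, polygon):
    """Ray-casting test; points exactly on the boundary may go either way."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Made-up occurrence coordinates forming a diamond with one interior point.
occurrences = [
    (-110.0, 40.0), (-105.0, 45.0), (-100.0, 40.0),
    (-105.0, 35.0), (-105.0, 40.0),
]
mcp = minimum_convex_polygon(occurrences)

# Keep only candidate points that fall within the MCP.
candidates = [(-104.0, 41.0), (-90.0, 50.0)]
retained = [p for p in candidates if inside_polygon(p, mcp)]
```

Here the interior occurrence is not a hull vertex, so the MCP has four vertices, and only the first candidate point lies inside it.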
As mentioned in the text and shown in the Rscript files, the R analyses relied largely on the R packages raster, ENMwizard, and ENMeval. To replicate our ENM results, only the occurrence files and Rscripts are needed, along with the climate/paleoclimate data layers, because the results files and other analysis files (calibration area shapefiles) are all generated by the Rscripts.
Other information necessary for reproducing our analyses is provided in the main text, in Appendix S1 and the Data S1 and S2 files of the Supporting Information, and in the README file in this accession.
Cite this dataset
Bagley, Justin; Heming, Neander; Gutierrez, Eliecer (2018), “Data for: Genotyping-by-sequencing and ecological niche modeling illuminate phylogeography, admixture, and Pleistocene range dynamics in quaking aspen (Populus tremuloides)”, Mendeley Data, v1 http://dx.doi.org/10.17632/jhkhvdgyfy.1
The files associated with this dataset are licensed under a Creative Commons Attribution 4.0 International licence.