Data for: Genotyping-by-sequencing and ecological niche modeling illuminate phylogeography, admixture, and Pleistocene range dynamics in quaking aspen (Populus tremuloides)

Published: 24 Aug 2018 | Version 1 | DOI: 10.17632/jhkhvdgyfy.1

Description of this data

In support of the manuscript by Bagley et al. (in review; see below) on quaking aspen phylogeography and ecological niche modeling (ENM), this accession provides (1) the in-house laboratory protocol used to extract DNA from aspen leaf tissues (modified from Strauss Lab); (2) code used to conduct independent runs of the TASSEL-GBSv2 SNP discovery pipeline (Glaubitz et al. 2014) on our final (combined) genotyping-by-sequencing (GBS) dataset; (3) the resulting SNP variant files from TASSEL-GBSv2 and the final filtered variant call data files used in our genomic analyses; and (4) unfiltered and filtered species occurrence data files, along with the computer code used in our ENM analyses of the focal taxon, Populus tremuloides.

Bagley, J. C., Heming, N. M., Gutiérrez, E. E., Devisetty, U. K., Mock, K. E., Eckert, A. J., & Strauss, S. H. (in review). Genotyping-by-sequencing and ecological niche modeling illuminate phylogeography, admixture, and Pleistocene range dynamics in quaking aspen (Populus tremuloides). Molecular Ecology.

Glaubitz, J. C., Casstevens, T. M., Lu, F., Harriman, J., Elshire, R. J., Sun, Q., & Buckler, E. S. (2014). TASSEL-GBS: a high capacity genotyping by sequencing analysis pipeline. PLoS ONE, 9(2), e90346.

Steps to reproduce


Molecular laboratory methods:

  • Conduct DNA extraction from leaf tissues as described in the enclosed in-house protocol
  • Follow other methods for sequencing and dataset construction listed in the Materials and Methods section of the manuscript

Genomic data analyses involved:

  • Preparing code and data files, installing requisite software, and setting up the directory structure on local machines (Mac) and on a high-performance computing cluster (Linux)
  • Running the SNP discovery pipeline TASSEL-GBSv2 on our final dataset (raw sequence files from our GBS experiment and from plates sequenced by Schilling et al. 2014; see text for details)
  • Conducting various phylogenomic and population genetic analyses in TreeMix and R
  • Plotting results, exploring the data/results, and conducting statistical analyses in R
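
As an illustration of how unfiltered variant calls might be reduced to a filtered set, the following minimal Python sketch drops sites with too many missing genotype calls from VCF-formatted lines. It is not the authors' code. The function names, the toy records, and the `max_missing` threshold are all hypothetical; the filters actually applied are described in the manuscript and the accompanying files.

```python
# Hypothetical sketch: keep VCF sites whose missing-call fraction is small.
# Assumes standard VCF layout: 9 fixed columns, then one column per sample.

MISSING_CALLS = {"./.", ".|.", "."}


def missing_fraction(genotypes):
    """Return the fraction of sample fields whose GT call is missing."""
    missing = sum(1 for g in genotypes if g.split(":")[0] in MISSING_CALLS)
    return missing / len(genotypes)


def filter_sites(vcf_lines, max_missing=0.2):
    """Keep header lines and sites with at most max_missing missing calls."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)  # header and metadata lines pass through
            continue
        fields = line.rstrip("\n").split("\t")
        genotypes = fields[9:]  # sample columns follow the FORMAT column
        if genotypes and missing_fraction(genotypes) <= max_missing:
            kept.append(line)
    return kept


# Toy input (invented for illustration only).
toy_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2\ts3\ts4",
    "chr1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/0\t0/1\t1/1\t0/0",
    "chr1\t200\t.\tC\tT\t.\tPASS\t.\tGT\t./.\t./.\t0/1\t0/0",
]
filtered = filter_sites(toy_vcf, max_missing=0.25)
```

In this toy input the site at position 200 has half of its calls missing, so it is removed, while the two header lines and the site at position 100 are retained.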

Ecological niche modeling analyses involved:

  • Preparing code and installing software on local machines (Mac)
  • Preparing the environmental data in R
  • Preparing the occurrence data in R
  • Preparing minimum convex polygons (MCPs) for the full-species and cluster datasets, and a minimum concave polygon (MCcP) for the full-species occurrence data
  • Extracting cluster coordinates from within MCP-based calibration areas (see text, Rscripts, Appendix S1, and Data S2 for details)
  • Tuning MaxEnt model parameters (feature classes, FCs, and regularization multiplier, RM) using the ENMevaluate function of ENMeval
  • Running final ENMs on the species and cluster datasets, using parameters selected in ENMeval
  • Projecting the final ENMs onto different climate scenarios (time-slices)
  • Plotting results and calculating metrics describing the models
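
The MCP and coordinate-extraction steps listed above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the authors' R workflow: coordinates are treated as planar (x, y) pairs, the hull is built with Andrew's monotone chain algorithm, and points are tested with a ray-casting rule. All function names and the toy coordinates are hypothetical.

```python
# Hypothetical sketch: minimum convex polygon (convex hull) of occurrence
# points, plus a test for whether a point lies inside that polygon.
# Coordinates are treated as planar, which is only an approximation for
# longitude/latitude data.


def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def point_in_polygon(point, polygon):
    """Ray-casting test; points exactly on an edge may go either way."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Toy occurrence coordinates (invented for illustration only).
occurrences = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)]
mcp = convex_hull(occurrences)
inside_mcp = [p for p in [(2, 2), (5, 2)] if point_in_polygon(p, mcp)]
```

A real analysis would use projected coordinates or a geodesic-aware library, and would typically buffer the polygon before using it as a calibration area; the details used in the study are given in the Rscripts and Supporting Information.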

As mentioned in the text and shown in the Rscript files, the R analyses relied largely on the R packages raster, ENMwizard, and ENMeval. For the ENM analyses, only the occurrence data and Rscripts, together with the climate/paleoclimate data layers, are needed to replicate our results, because the results files and other analysis files (calibration area shapefiles) are all generated by the Rscripts.

Other information necessary for reproducing our analyses is provided in the main text; in Appendix S1 and the Data S1 and S2 files of the Supporting Information; and in the README file in this accession.

Cite this dataset

Bagley, Justin; Heming, Neander; Gutierrez, Eliecer (2018), “Data for: Genotyping-by-sequencing and ecological niche modeling illuminate phylogeography, admixture, and Pleistocene range dynamics in quaking aspen (Populus tremuloides)”, Mendeley Data, v1

Institutions: Universidade de Brasilia, Utah State University, Virginia Commonwealth University, Oregon State University, Universidade Federal de Santa Maria


Categories: Population Genetics, Plant Population Genetics, Molecular Plant Phylogeography, Geographic Information Systems, Data Analysis, Computer Software, Niche Modelling, Species Distribution Model, Molecular Ecology, Phylogeography


CC BY 4.0

The files associated with this dataset are licensed under a Creative Commons Attribution 4.0 International licence.

What does this mean?
You can share, copy and modify this dataset so long as you give appropriate credit, provide a link to the CC BY license, and indicate if changes were made, but you may not do so in a way that suggests the rights holder has endorsed you or your use of the dataset. Note that further permission may be required for any content within the dataset that is identified as belonging to a third party.