Fine-scale Honduran Rhizophora mangle SNP data set
Description
IIB-RAD single-nucleotide polymorphism (SNP) data set for the species Rhizophora mangle within a small parcel of forest on the mangrove cay Fort Cay, Roatan, Bay Islands, Honduras.
Steps to reproduce
RAD-seq library preparation
Libraries were prepared using a modified version of the Wang et al. (2012) restriction site-associated DNA (RAD) protocol, which uses type IIB restriction enzymes that cut both upstream and downstream of the enzyme’s target site, producing RAD tags of uniform length. Briefly, approximately 50–100 ng of high-quality genomic DNA (a thin, bright band on a gel, with no smearing) from each sample was digested with the enzyme BcgI (New England BioLabs, Ipswich, USA), producing uniform 36 bp fragments with random overhangs. Genomic digests were then ligated to a pair of partially double-stranded adaptors that target a reduced subset of BcgI sites, with the reduction scheme depending on organism genome size. RAD tags were then amplified with sample-specific 5,6 bp or 6,6 bp dual barcodes and Illumina adaptors. PCR products were visualized on a 2.0% agarose gel to verify the presence of the expected 160–170 bp target band (i.e., fragment, barcodes and adaptors included). Gel purification of the target band was carried out following the protocols outlined in Guo et al. (2014). Amplification products were pooled at equimolar concentrations and sequenced on an Illumina HiSeq 3000 (San Diego, USA) at the Center for Genome Research and Biocomputing at Oregon State University.
SNP calling and quality control
Raw reads were downloaded from the Oregon State University online portal. Libraries for three trees failed to amplify. Successfully amplified libraries from the remaining 182 trees were processed using ipyrad v0.9.12 (Eaton and Overcast 2020) on the Smithsonian Institution High Performance Computing Cluster (https://doi.org/10.25572/SIHPC). The genome of R. apiculata (Xu et al. 2017), a close relative of R. mangle, was used as the reference genome.
In ipyrad v0.9.12 (Eaton and Overcast 2020), all parameters were left at their defaults except the following: data type = 2brad; restriction overhang = ‘TGCAG’; cluster threshold = 0.85; maximum barcode mismatch = 0; filter adapters = 2; filter minimum trim length = 20; maximum alleles in consensus = 2; minimum samples per locus = 4; trim reads = 0; and trim loci = 0. An initial panel of 113,626 SNPs was generated. Screening for null alleles, deviation from Hardy-Weinberg equilibrium, and linkage disequilibrium was conducted in RStudio (R Core Team 2020) using the packages adegenet (Jombart and Ahmed 2011), poppr (Kamvar, Tabima, and Grünwald 2014), and genepop (Rousset 2008). After filtering and quality control, a panel of 575 informative SNPs was identified.
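For anyone reproducing the assembly, the non-default ipyrad settings listed above can be collected in one place, as in the minimal Python sketch below. The dictionary keys are ipyrad parameter names as we understand them, and their mapping from the prose is an assumption. The values are copied from the text, not from the authors' params file. The helper format_overrides is hypothetical and only prints the settings for a methods log; it does not call ipyrad.

```python
# Non-default ipyrad settings as listed in the description above.
# Assumption: keys follow ipyrad's parameter naming; values come from the
# prose, not from the authors' actual params file.
NON_DEFAULT_PARAMS = {
    "datatype": "2brad",
    "restriction_overhang": "TGCAG",
    "clust_threshold": 0.85,
    "max_barcode_mismatch": 0,
    "filter_adapters": 2,
    "filter_min_trim_len": 20,
    "max_alleles_consens": 2,
    "min_samples_locus": 4,
    "trim_reads": 0,
    "trim_loci": 0,
}


def format_overrides(params):
    """Render the overrides as aligned 'name = value' lines (hypothetical helper)."""
    width = max(len(name) for name in params)
    return "\n".join(
        f"{name.ljust(width)} = {value}" for name, value in params.items()
    )


if __name__ == "__main__":
    print(format_overrides(NON_DEFAULT_PARAMS))
```

Keeping the overrides in a single structure makes it easier to compare them against a params file generated by ipyrad itself before running the assembly.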
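To illustrate what a per-locus screen for deviation from Hardy-Weinberg equilibrium involves, the sketch below applies a generic chi-square goodness-of-fit test to the genotype counts of one biallelic SNP. It is not the authors' procedure, which used adegenet, poppr, and genepop in R. The function name hwe_chi_square and the genotype counts in the usage example are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for Hardy-Weinberg proportions at a biallelic SNP.

    Arguments are observed counts of the two homozygotes and the heterozygote.
    Returns (chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped individuals at this locus")
    # Allele frequencies estimated from the observed genotypes.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    # Expected genotype counts under Hardy-Weinberg equilibrium.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Classes with zero expectation (monomorphic loci) contribute nothing.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts for one SNP typed in 182 trees.
    chi2, p_value = hwe_chi_square(60, 80, 42)
    print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

In practice a multiple-testing correction across loci would normally accompany such a test, and exact tests are often preferred when genotype counts are small.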