SNP array data for a native European flat oyster (Ostrea edulis) population

Published: 27 June 2022 | Version 2 | DOI: 10.17632/v7ppb2xr99.2
Carolina Penaloza,
Agustin Barria,
Athina Papadopoulou,
C Hoop,
Joanne Preston,
Matthew Green,
Luke Helmer,
Jacob Kean-Hammerson,
Jennifer C Nascimento-Schulze,
Diana Minardi,
Manu Gundappa,
Dan Macqueen,
John Hamilton,
Ross Houston,
Tim P Bean


Genotype data for a hatchery-derived European flat oyster population (n=840 individuals) with matching growth phenotype data (see link below). Four growth-related traits were measured in the experimental population: Body weight (BW, the weight of an individual oyster including the shell), Shell length (SL, the maximum distance between the anterior and posterior margins), Shell height (SH, the maximum distance from the hinge to the furthermost edge), and Shell width (SW, the maximum distance at the thickest part of the two shell valves). Individual oysters were genotyped using the combined-species Affymetrix Axiom® oyster SNP array. The genotype data are available in VCF format for markers passing quality control (QC) filters (see steps below).


Steps to reproduce

SNP genotypes were imported into Axiom Analysis Suite v4.0.3.3 for QC assessment and genotype calling. Genotypes were generated using the default parameter settings for diploid species. Probes from the SNP array were mapped to the chromosome-level genome assembly of Ostrea edulis (GenBank GCA_023158985.1). QC was conducted using PLINK v2.0. SNP variants with a call rate >95% and a minor allele frequency >0.05 were retained. Because significant sub-clustering was detected in the data, a k-means clustering method was used to assign individuals to groups. Deviations from Hardy-Weinberg equilibrium (HWE) were tested separately in each of the three genetic clusters identified by this analysis. SNP markers showing significant deviations (HWE p-value < 1e-10) in two of the three clusters were excluded from the analysis. Sample QC consisted of removing individual oysters with missingness above 5% or with high heterozygosity (i.e. more than three median absolute deviations from the median). The final dataset comprises 840 samples genotyped at 4,577 genome-wide SNPs.
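
The marker and sample filters described above can be illustrated with a minimal Python sketch. It is not the PLINK workflow used for this dataset. It makes several assumptions that are not in the original record: genotypes are coded as alternate-allele counts (0, 1, 2) with None for a missing call; HWE is tested with a 1-d.f. chi-square approximation standing in for whatever test PLINK applied; "in two of the three clusters" is read as "in at least two clusters"; and "high heterozygosity" is read as exceeding the median by more than three MADs. All function and parameter names are hypothetical.

```python
import math
from statistics import median

MISSING = None  # assumed missing-genotype code for this sketch


def called(genotypes):
    """Return only the non-missing genotype calls."""
    return [g for g in genotypes if g is not MISSING]


def call_rate(genotypes):
    """Fraction of non-missing calls."""
    return len(called(genotypes)) / len(genotypes)


def minor_allele_frequency(genotypes):
    """Minor allele frequency from alternate-allele counts (0, 1, 2)."""
    calls = called(genotypes)
    if not calls:
        return 0.0
    alt_freq = sum(calls) / (2 * len(calls))
    return min(alt_freq, 1 - alt_freq)


def hwe_chi2_pvalue(genotypes):
    """Chi-square (1 d.f.) HWE p-value; an approximation, not an exact test."""
    calls = called(genotypes)
    n = len(calls)
    if n == 0:
        return 1.0
    p = sum(calls) / (2 * n)  # alternate-allele frequency
    if p in (0.0, 1.0):
        return 1.0  # monomorphic: no deviation can be tested
    q = 1 - p
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    observed = [calls.count(k) for k in (0, 1, 2)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square distribution with 1 d.f.
    return math.erfc(math.sqrt(chi2 / 2))


def keep_snp(genotypes, cluster_labels, min_call=0.95, min_maf=0.05,
             hwe_p=1e-10, max_failing_clusters=1):
    """Apply the call-rate, MAF and per-cluster HWE rules to one SNP."""
    if call_rate(genotypes) <= min_call:
        return False
    if minor_allele_frequency(genotypes) <= min_maf:
        return False
    failing = 0
    for label in set(cluster_labels):
        subset = [g for g, c in zip(genotypes, cluster_labels) if c == label]
        if hwe_chi2_pvalue(subset) < hwe_p:
            failing += 1
    return failing <= max_failing_clusters


def keep_samples(matrix, max_missing=0.05, n_mads=3.0):
    """Return indices of samples (rows) passing missingness and heterozygosity QC."""
    missing = [1 - call_rate(row) for row in matrix]
    het = []
    for row in matrix:
        calls = called(row)
        het.append(sum(g == 1 for g in calls) / len(calls) if calls else 0.0)
    med = median(het)
    mad = median([abs(h - med) for h in het])
    cutoff = med + n_mads * mad  # only unusually high heterozygosity is flagged
    return [i for i in range(len(matrix))
            if missing[i] <= max_missing and het[i] <= cutoff]
```

In this sketch, keep_snp takes the genotypes of one marker across all samples together with each sample's cluster label, while keep_samples takes a samples-by-markers matrix. The cluster labels themselves would come from a separate k-means step, which is not shown here.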


University of Exeter, University of Southampton, The University of Edinburgh, Centre for Environment Fisheries and Aquaculture Science, University of Portsmouth


Aquaculture, Single Nucleotide Polymorphism, Oyster