Barley exome VCF and chloroplast genomes

Published: 24 June 2024 | Version 1 | DOI: 10.17632/f6bjrszvdh.1
Peter Civan, Agostino Fricano


These datasets relate to the paper 'Genetic erosion in domesticated barley and a hypothesis of a North African centre of diversity' (Civan et al., in preparation). The Whealbi VCF file is a general-purpose diversity matrix constructed from the exome data produced by the Whealbi consortium. Redundant accessions, defined as those exceeding an empirical identity-by-state (IBS) threshold of 0.985, were removed. The FASTA file is a multiple sequence alignment (MSA) of chloroplast genomes reconstructed from various sequencing datasets by mapping reads to the Morex chloroplast genome (NCBI accession EF115541), which was used as the reference. The MSA was visually checked for misaligned regions and corrected manually where necessary (except in homopolymers and microsatellites). The second inverted repeat was removed prior to read mapping. The data can be used for phylogeographic inference. Note of caution: the alignment region 35,650-42,725 bp, which spans the genes atpA, rps14, psaB and psaA, contains multiple ambiguities because a paralogous copy from the mitochondrial genome interfered with the assembly. This region should be excluded from sequence analyses.
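The caution above about alignment region 35,650-42,725 can be acted on before any downstream analysis by deleting those columns from every sequence in the MSA. The following is a minimal sketch rather than part of the published workflow. It assumes the stated coordinates are 1-based, inclusive alignment columns, and the file name in the usage comment is hypothetical.

```python
from typing import Dict, Iterable, List

# Alignment columns flagged in the dataset description.
# Assumption: the coordinates are 1-based and inclusive.
MASK_START = 35_650
MASK_END = 42_725


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {header: sequence}, joining wrapped lines."""
    records: Dict[str, List[str]] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {header: "".join(parts) for header, parts in records.items()}


def drop_columns(seq: str, start: int = MASK_START, end: int = MASK_END) -> str:
    """Return seq without alignment columns start..end (1-based, inclusive)."""
    return seq[: start - 1] + seq[end:]


def mask_alignment(records: Dict[str, str]) -> Dict[str, str]:
    """Apply drop_columns to every aligned sequence."""
    return {header: drop_columns(seq) for header, seq in records.items()}


# Usage (hypothetical file name):
# with open("chloroplast_msa.fasta") as handle:
#     trimmed = mask_alignment(read_fasta(handle))
```

Deleting the columns keeps all remaining sequences the same length, so the result is still a valid alignment. Replacing the region with 'N' instead would preserve the original coordinates, at the cost of carrying uninformative sites into the analysis.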



Genomics, Chloroplast DNA, Genetic Diversity, Barley