The complete mitochondrial genome of the South African snoek Thyrsites atun (Euphrasén, 1791) (Perciformes, Gempylidae)

Published: 26 October 2022 | Version 2 | DOI: 10.17632/2ky749tzw9.2
Siphesihle Mthethwa, Rouvay Roodt-Wilding


Supplementary data for the snoek mitochondrial genome announcement publication. The data include a table defining the characteristics of the complete circular mitogenome, a table of accession numbers for the sequences used in Bayesian inference, the input NEXUS file, and the output consensus tree.


Steps to reproduce

1. Sample collection and DNA extraction
Genomic DNA was extracted from a fin clip of a snoek individual caught in St Helena Bay, South Africa (32°44’29.5” S, 18°01’12.2” E) in November 2010, using the CTAB extraction protocol.

2. Library construction and Ion Torrent sequencing
Sequencing was carried out at the Central Analytical Facility of Stellenbosch University, South Africa. Before library construction, gDNA concentration was quantified on the Qubit™ 4 Fluorometer, and purity was determined using the NanoDrop™ ND-2000 spectrophotometer. Genome Quality Scores were determined by electrophoresis using the PerkinElmer® LabChip GX Touch 24 Nucleic Acid Analyzer. Library preparation was performed using the Ion Plus Fragment Library Kit according to the manufacturer’s protocol (Ion Xpress™ Plus gDNA Fragment Library Preparation User Guide). Following library preparation, template DNA was enriched using the Ion 530™ Chef Kit. Sequencing was carried out on the Ion GeneStudio S5 Prime System. After sequencing, raw reads were trimmed in Torrent Suite v5.12.0 to remove the 3’ adaptor sequence. Further trimming was performed using a 30 bp sliding window and a threshold value of 16; this was repeated until the average quality value (QV) of the last 30 bases was higher than 16. Reads trimmed to fewer than 25 bases were removed from the dataset.

3. Mitochondrial genome assembly and annotation
Raw reads were mapped to an unpublished complete mitogenome of T. atun sampled from New Zealand (from a related study) in Geneious Prime v2020.2.5, using the Geneious mapper with medium/low sensitivity and fine-tuning of up to five iterations, followed by manual curation. Genome annotation was performed with MitoAnnotator and confirmed with the MITOS genome annotation pipeline using the vertebrate mitochondrial genetic code. Annotations were performed in Geneious Prime.

4. Gempylidae phylogeny
Thirteen PCG sequences, excluding stop codons, were aligned separately in ClustalX2. The resulting alignments were optimized manually in BioEdit v7.2.5. The best-fitting nucleotide substitution model was determined in jModelTest v2.1.10 according to the Bayesian information criterion. Following model testing, the thirteen alignments were concatenated in the order in which they appear in the mitogenome. Bayesian Inference (BI) was performed in MrBayes v3.2.7 using moderately gene-partitioned data. The Markov Chain Monte Carlo analysis was run for 20 000 000 generations, sampling every 1 000 generations. The first 25% of sampled trees were discarded as burn-in, and the remaining trees were used to estimate the consensus tree (50% majority rule) and the Bayesian Posterior Probabilities. To ensure that stationarity had been reached, the Effective Sample Size (ESS) for all sampling parameters was checked in Tracer v1.7 (Rambaut et al., 2018). The resulting BI tree was visualized and edited in FigTree v1.4.4.
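The quality trimming described in step 2 can be illustrated with a minimal Python sketch. It is not the Torrent Suite implementation; it only mirrors the stated parameters (30 bp window, threshold 16, 25-base minimum). The function name `trim_3prime`, the removal of one base per iteration, the shortened window for reads under 30 bases, and the discarding of reads that start below 25 bases are all assumptions made for illustration.

```python
from statistics import mean

WINDOW = 30      # bases in the 3' window (from the text)
MIN_QV = 16      # quality threshold (from the text)
MIN_LENGTH = 25  # reads shorter than this are removed (from the text)


def trim_3prime(quals):
    """Return how many leading bases to keep, or 0 if the read is discarded.

    quals is a list of per-base quality values, 5' to 3'.
    """
    keep = len(quals)
    while keep >= MIN_LENGTH:
        # Last WINDOW bases of the part still kept (shorter if keep < WINDOW).
        window = quals[max(0, keep - WINDOW):keep]
        if mean(window) > MIN_QV:
            return keep
        keep -= 1  # assumption: one base is dropped from the 3' end per pass
    return 0  # trimmed below MIN_LENGTH: remove the whole read


# Hypothetical read: 50 good bases followed by a 20-base low-quality tail.
print(trim_3prime([30] * 50 + [5] * 20))
```

With this hypothetical read, trimming stops as soon as the mean of the final 30 kept bases rises above 16, so part of the low-quality tail can remain in the read.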
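The bookkeeping behind the concatenation and MCMC settings in step 4 can be sketched as follows. This is an illustrative calculation, not part of the deposited files: the gene lengths are hypothetical placeholders, the helper names `charset_ranges` and `retained_samples` are invented here, and the sample count ignores the initial sample at generation 0.

```python
def charset_ranges(gene_lengths):
    """Map each gene to its 1-based, inclusive (start, end) columns
    in the concatenated alignment, in the order given."""
    ranges = {}
    start = 1
    for gene, length in gene_lengths:
        end = start + length - 1
        ranges[gene] = (start, end)
        start = end + 1
    return ranges


def retained_samples(ngen, samplefreq, burnin_frac):
    """Samples kept per run after discarding the burn-in fraction
    (assumption: the sample at generation 0 is not counted)."""
    total = ngen // samplefreq
    return total - int(total * burnin_frac)


# Hypothetical alignment lengths, for illustration only (not from the dataset).
example = [("ND1", 972), ("ND2", 1044), ("COI", 1548)]
for gene, (start, end) in charset_ranges(example).items():
    print(f"charset {gene} = {start}-{end};")

# Settings from the text: 20 000 000 generations, sampling every 1 000, 25% burn-in.
print(retained_samples(20_000_000, 1_000, 0.25))
```

Under these settings each run yields 20 000 sampled trees, of which 5 000 are discarded as burn-in and 15 000 contribute to the consensus tree. The printed `charset` lines follow the NEXUS convention for defining partitions over a concatenated alignment.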


Stellenbosch University


Molecular Phylogenetics, Biomolecule Characterization