Mitochondrial genome assembly and comparative mitogenomics of five snake mackerels (Perciformes, Gempylidae).

Published: 9 February 2023| Version 1 | DOI: 10.17632/z5vvwvjhzx.1
Siphesihle Mthethwa,
Aletta Bester-van der Merwe,
Rouvay Roodt-Wilding


The family Gempylidae, commonly known as snake mackerels, is a diverse group of fishes in the Perciformes. The family comprises 24 species in 16 genera, a number of which are economically important. Despite substantial research on this family using morphology-based and genetic methods, taxonomic relationships within the group remain unresolved. In this study, we demonstrate the utility of mitogenomes in resolving complicated taxonomic relationships in the gempylids. Using the Ion Torrent next-generation sequencing (NGS) platform, we characterized the complete mitogenomes of Neoepinnula minetomai (Nakayama, Kimura, and Endo, 2014), Neoepinnula orientalis (Gilchrist & von Bonde, 1924), Rexea antefurcata (Parin, 1989), Rexea prometheoides (Bleeker, 1856), and Thyrsites atun (Euphrasen, 1791). We then conducted comparative analyses of codon usage, nucleotide composition, gene content, and gene-order arrangement, and looked for signs of selection in the mitogenomes. Finally, we constructed phylogenetic trees using Bayesian inference and maximum-likelihood methods and estimated divergence times for the Gempylidae using 13 protein-coding genes. The input and output data for molecular divergence dating, maximum-likelihood tree inference, and Bayesian inference, as well as the input data for the selection analyses, are all included in this publication.
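For readers who want to see what the codon-usage comparison measures, the following is a minimal Python sketch of relative synonymous codon usage (RSCU) under the vertebrate mitochondrial genetic code (NCBI translation table 2). It is not the authors' workflow, which used MEGA. The function name `rscu` and the toy sequence are illustrative assumptions, and the sketch assumes RSCU is computed over all synonymous codons of an amino acid, which may differ from how a given program groups codon families.

```python
from collections import Counter
from itertools import product

# NCBI translation table 2 (vertebrate mitochondrial), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CCWW" "LLLLPPPPHHQQRRRR"
         "IIMMTTTTNNKKSS**" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for one in-frame coding sequence (hypothetical helper)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group synonymous codons by the amino acid they encode.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    result = {}
    for aa, family in families.items():
        if aa == "*":
            continue  # stop codons are excluded
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for c in family:
            # Observed count divided by the count expected under equal usage.
            result[c] = counts[c] * len(family) / total
    return result


if __name__ == "__main__":
    toy = "CTACTACTACTT"  # three CTA and one CTT, all leucine
    for codon, value in sorted(rscu(toy).items()):
        print(codon, round(value, 2))
```

An RSCU above 1 indicates a codon used more often than expected if all synonymous codons were used equally; a value below 1 indicates the opposite.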


Steps to reproduce

1. The tissue samples used in this study were donated by local and international government fisheries and by international research institutions.
2. Genomic DNA was extracted using the CTAB protocol and sent to the Central Analytical Facility of Stellenbosch University for sequencing on the Ion Torrent S5 next-generation platform.
3. After quality filtering (removal of adapters, poor-quality reads, and short reads), the filtered reads were used for mitogenome assembly in Geneious. Each mitogenome was assembled by mapping reads to a closely related reference, i.e. a species within the same genus whose sequence was already available in NCBI's GenBank. The exception was T. atun, which required a combination of de novo and map-to-reference assembly.
4. The newly assembled mitogenomes were annotated using MitoAnnotator and confirmed with the MITOS gene annotation pipeline.
5. Gene content and order were visually inspected. Nucleotide composition, codon bias, and relative synonymous codon usage were estimated using MEGA.
6. Signatures of selection were investigated for each gene region using DnaSP, and positively selected sites were investigated in the complete mitogenomes of the gempylids using EasyCodeML.
7. The 13 protein-coding genes, excluding their stop codons, were aligned separately (with a substitution model estimated for each gene) and then concatenated, giving an alignment of 11,400 bp. This full data set was then used to construct phylogenetic trees using Bayesian inference and maximum-likelihood approaches. Phylogenetic trees were also constructed for individual genes, and their resolving power was compared with that of the full mitogenome.
8. For molecular divergence dating, the full data set was partitioned by codon position (1, 2, and 3), and ModelFinder was used to find the partitioning scheme that best represents the full data set. The analysis was carried out using the Calibrated Yule tree model together with an uncorrelated relaxed molecular clock model.
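The data preparation described in steps 7 and 8 (removing stop codons, concatenating per-gene alignments, and defining codon-position partitions) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' pipeline: the helper names (`trim_stop`, `concatenate`, `codon_position_charsets`), the gene labels, and the toy sequences are hypothetical, and the sketch assumes each gene has already been aligned in frame by an external tool. It emits one charset per gene and codon position in NEXUS-style `start-end\3` notation, a starting scheme that a model-selection tool could then merge; the partitioning actually used in the study may differ.

```python
# Stop codons of the vertebrate mitochondrial code (NCBI translation table 2).
STOP_CODONS = {"TAA", "TAG", "AGA", "AGG"}


def trim_stop(cds):
    """Remove a complete or incomplete (T / TA) stop codon from an unaligned CDS."""
    cds = cds.upper()
    cds = cds[: len(cds) - len(cds) % 3]  # drop a trailing partial codon
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]
    return cds


def concatenate(alignments):
    """alignments: {gene: {taxon: aligned_seq}} -> (supermatrix, partitions).

    partitions is a list of (gene, start, end) with 1-based inclusive coordinates.
    Taxa missing from a gene are padded with gaps.
    """
    taxa = sorted({t for aln in alignments.values() for t in aln})
    matrix = {t: [] for t in taxa}
    partitions = []
    start = 1
    for gene, aln in alignments.items():
        lengths = {len(s) for s in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: aligned sequences differ in length")
        length = lengths.pop()
        for t in taxa:
            matrix[t].append(aln.get(t, "-" * length))
        end = start + length - 1
        partitions.append((gene, start, end))
        start = end + 1
    return {t: "".join(parts) for t, parts in matrix.items()}, partitions


def codon_position_charsets(partitions):
    """One NEXUS-style charset line per gene and codon position."""
    lines = []
    for gene, start, end in partitions:
        for pos in range(3):
            lines.append(f"charset {gene}_pos{pos + 1} = {start + pos}-{end}\\3;")
    return lines


if __name__ == "__main__":
    # Toy, already-aligned input; gene and taxon names are placeholders.
    toy = {
        "COI": {"taxonA": "ATGAAA", "taxonB": "ATGAAG"},
        "ND1": {"taxonA": "ATGCTA", "taxonB": "ATGCTT"},
    }
    supermatrix, parts = concatenate(toy)
    print(supermatrix)
    print("\n".join(codon_position_charsets(parts)))
```

Defining charsets per gene rather than over the whole concatenated matrix keeps codon positions correct even if a gene's aligned length is not a multiple of three.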


Stellenbosch University Department of Genetics


Input/Output, Mitochondrion, Phylogenetic Tree, Biomolecule Characterization


National Research Foundation