Genome Compositional Domains of Hymenopteran Insects and Representatives of Five Other Insect Orders

Published: 31 March 2022 | Version 1 | DOI: 10.17632/nxgvjwds9f.1
Contributors:

Description

RESEARCH ABSTRACT

The compositional structure of eukaryotic genomes has long been known to be non-uniform, and molecular evolutionists have proposed theories and models to describe eukaryotic genome architecture. Genome sequencing of the European honey bee (Apis mellifera), a model for the biology and evolution of eusocial behavior, has revealed unusual compositional characteristics, including a low but heterogeneous GC content and a tendency for genes to be located in low-GC regions. In this study, we sought to determine whether these striking features are common among eusocial hymenopteran insect species. Using a recursive segmentation algorithm, we partitioned 26 insect genomes and compared their compositional domains. We found that the genomes of Apis species have the most heterogeneous GC content and that a bimodal distribution of GC content is unique to Apis. We also found that hymenopteran genes tend to be located in low-GC regions, although this bias is strong only in eusocial bees and the Indian jumping ant, Harpegnathos saltator. Additionally, our GO analysis suggests that genes in high-GC regions of the honey bee genome tend to play important roles in biological regulation processes.

DATA DESCRIPTION

Genome assemblies of 21 hymenopteran insects and representatives of five other insect orders were segmented into GC compositional domains by IsoPlotter 2.4 (Elhaik and Graur 2013) (https://code.google.com/archive/p/isoplotter/).

One data file is provided per genome assembly, named according to the rule:

[species name]_[assembly name]_GC_content_domains.txt

Files are saved in tab-delimited text format with 5 columns:

1. Scaffold or Chromosome Accession
2. Domain Start Coordinate
3. Domain End Coordinate
4. Domain Percent GC
5. Is homogeneous
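The five-column layout described above can be read with a few lines of Python. The following is a minimal sketch, not part of the dataset or of IsoPlotter. It rests on several assumptions that the description does not confirm: that a header row, if present, can be recognized by non-numeric coordinate fields; that coordinates are integers and inclusive at both ends; and that the "Is homogeneous" column is best kept as raw text because its encoding is not documented here. The names `Domain`, `parse_domains`, and `weighted_mean_gc` are hypothetical.

```python
import csv
from typing import Iterable, Iterator, NamedTuple


class Domain(NamedTuple):
    accession: str       # column 1: scaffold or chromosome accession
    start: int           # column 2: domain start coordinate
    end: int             # column 3: domain end coordinate
    percent_gc: float    # column 4: domain percent GC
    is_homogeneous: str  # column 5: kept as raw text (encoding is an assumption)


def parse_domains(lines: Iterable[str]) -> Iterator[Domain]:
    """Yield one Domain per tab-delimited data row."""
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 5:
            continue  # skip blank or short lines
        try:
            start, end, gc = int(row[1]), int(row[2]), float(row[3])
        except ValueError:
            continue  # assumed to be a header row; skipped
        yield Domain(row[0], start, end, gc, row[4].strip())


def weighted_mean_gc(domains: Iterable[Domain]) -> float:
    """Length-weighted mean percent GC over the given domains."""
    total_length = 0
    weighted_sum = 0.0
    for d in domains:
        length = d.end - d.start + 1  # assumes inclusive coordinates
        total_length += length
        weighted_sum += d.percent_gc * length
    if total_length == 0:
        raise ValueError("no domains supplied")
    return weighted_sum / total_length
```

To apply it to one of the data files, open the file with `open(path, newline="")`, where `path` points to a file named according to the rule above, and pass the file object to `parse_domains`. Weighting by domain length keeps many short domains from dominating the genome-wide average.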

Files

Steps to reproduce

IsoPlotter 2.4 (Elhaik and Graur 2013) (https://code.google.com/archive/p/isoplotter/) was used to segment each of the genomes into compositional domains.

Institutions

University of Missouri Columbia

Categories

Genomics

Licence