Microsatellite and SNP sequencing data of different populations of Litopenaeus vannamei
Description
Knowledge of the genetic information within a population is crucial for maintaining genetic diversity and for planning subsequent selection. This study assesses the genetic diversity of six populations of Litopenaeus vannamei (two breeding populations and four introduced populations, 180 individuals in total) based on genotyping data from 12 microsatellite (SSR) markers and whole-genome resequencing. After resequencing and filtering, 14,136,203 SNP loci were retained for genetic diversity analysis, and genetic diversity parameters were calculated for the six populations from both data sets. In the SSR analysis, SIS had the highest diversity and XH-F the lowest among the six populations. In the SNP analysis, XH-F appeared to be under selection and had the lowest genetic diversity of the six populations; SIS had the lowest heterozygosity but high polymorphism, with more low-frequency loci. The two marker types gave similar results for population differentiation: both showed that XH-F and SyAqua were highly differentiated from the other populations. Both molecular markers were able to clearly classify the different populations of L. vannamei, and PCA showed obvious differences among groups. In the phylogenetic tree and ancestry estimation analyses, SSRs could only classify the six populations into three groups, although at the individual level SSRs were more suitable for distinguishing genetic differences among individuals. As the K value increases, SSRs may face limitations that prevent further subdivision of the populations into six groups. In contrast, SNPs showed stronger discriminative ability and clearly divided the six populations into six groups. The data and results generated from this study will help enrich the genetic resources of L. vannamei and provide important reference information for its artificial breeding and genetic improvement.
The dataset consists of microsatellite sequencing data (.fsa) and filtered resequencing data (VCF format).
Steps to reproduce
DNA samples were obtained from four introduced populations and two breeding populations. Thirty shrimp were randomly selected from each of the six populations, for a total of 180 shrimp. Muscle samples were collected from each individual for DNA extraction. Genomic DNA was extracted using the MolPure Call/TIANamp Marine Animal DNA Kits (Yeasen Biotech Co., Ltd., Shanghai, China).
PCR amplification and SSR detection: The 25.0 μL PCR reaction system consisted of 1.0 μL template DNA (20-50 ng/μL), 0.5 μL each of forward and reverse primers, 2.5 μL 10× Taq buffer, 0.5 μL dNTP mix, 0.2 μL Taq enzyme, and ddH2O to a final volume of 25 μL. The amplification program was: initial denaturation at 95 ℃ for 5 min; 10 cycles of 94 ℃ for 30 s, 60 ℃ for 30 s, and 72 ℃ for 30 s; 30 cycles of 94 ℃ for 30 s, 55 ℃ for 30 s, and 72 ℃ for 30 s; and a final extension at 72 ℃ for 10 min. SSR detection of the PCR products was performed by Sangon Biotech (Shanghai) Co., Ltd. (Shanghai, China). An ABI 3730xl sequencer was used to obtain the genotype information for each SSR locus, and genotyping results were read with GeneMapper software.
SNP sequencing and genotyping: A small-fragment DNA library with an insert size of 300 to 500 bp was constructed. DNA nanoballs (DNBs) were then loaded onto the sequencing chip using the MGIDL-T7 loading device (the auxiliary loader for the DNBSEQ-T7 ultra-high-throughput sequencer) and sequenced by combinatorial probe-anchor synthesis (Huazhi Rice Bio-Tech Co., Ltd., Changsha, China). The original image data files obtained by high-throughput sequencing were converted into sequence reads by base calling. Low-quality bases and reads were trimmed with fastp. Sentieon was used to align the reads and call variants. The reference genome was our own sequencing assembly (GCF_042767895.1).
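As a quick arithmetic check of the SSR PCR recipe described above, the following minimal Python sketch computes the ddH2O volume needed to reach 25 μL and the total programmed hold time of the thermal profile. The component volumes and cycling steps are taken from the text; the function names are hypothetical, and the hold time ignores ramping between temperatures, so the real run time would be longer.

```python
# Component volumes (uL) for one 25 uL reaction, as listed in the protocol.
REACTION_VOLUME_UL = 25.0
COMPONENTS_UL = {
    "template DNA": 1.0,
    "forward primer": 0.5,
    "reverse primer": 0.5,
    "10x Taq buffer": 2.5,
    "dNTP mix": 0.5,
    "Taq enzyme": 0.2,
}

# Thermal profile as (number of cycles, [hold times in seconds per step]).
PROGRAM = [
    (1, [5 * 60]),        # initial denaturation, 95 C
    (10, [30, 30, 30]),   # 94 C / 60 C / 72 C
    (30, [30, 30, 30]),   # 94 C / 55 C / 72 C
    (1, [10 * 60]),       # final extension, 72 C
]


def water_volume_ul(components, total_ul):
    """Return the ddH2O volume (uL) that brings the mix to total_ul."""
    used = sum(components.values())
    if used > total_ul:
        raise ValueError("components exceed the reaction volume")
    return round(total_ul - used, 2)


def program_hold_seconds(program):
    """Sum the programmed hold times; ramping time is not included."""
    return sum(cycles * sum(steps) for cycles, steps in program)


if __name__ == "__main__":
    print("ddH2O per reaction (uL):", water_volume_ul(COMPONENTS_UL, REACTION_VOLUME_UL))
    print("hold time (min):", program_hold_seconds(PROGRAM) / 60)
```

The two annealing blocks (60 ℃ then 55 ℃) are kept as separate entries so the profile stays readable if either block is adjusted for a particular primer pair.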
To keep the most reliable SNPs for subsequent analysis, GATK was used to apply a preliminary hard filter to the SNPs obtained from the joint analysis. SNPs meeting any of the following criteria were removed: low quality-by-depth score (QD < 5.0); high Fisher strand score (FS > 60.0); low mapping quality (MQ < 40.0); high strand odds ratio (SOR > 3.0); low mapping quality rank-sum score (MQRankSum < -12.5); and low read position rank-sum score (ReadPosRankSum < -8.0). SNP loci with more than two alleles (max-alleles 2), a low minor allele frequency (MAF < 0.05), or a genotype call rate below 0.9 (max-missing 0.9) were also removed using VCFtools (v0.1.16).
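The filtering logic above can be sketched in plain Python to make the direction of each threshold explicit. This is an illustration only, not the GATK or VCFtools implementation: the function names are hypothetical, annotations are passed as a simple dictionary, and treating a missing annotation as "not failing" is an assumption of this sketch.

```python
# Each entry returns True when a SNP FAILS that hard filter (thresholds from the text).
HARD_FILTERS = {
    "QD": lambda v: v < 5.0,
    "FS": lambda v: v > 60.0,
    "MQ": lambda v: v < 40.0,
    "SOR": lambda v: v > 3.0,
    "MQRankSum": lambda v: v < -12.5,
    "ReadPosRankSum": lambda v: v < -8.0,
}


def failed_hard_filters(info):
    """Return the names of the hard filters a SNP fails.

    Assumption: an annotation that is absent or None does not fail its filter.
    """
    return [
        name
        for name, fails in HARD_FILTERS.items()
        if info.get(name) is not None and fails(info[name])
    ]


def passes_site_filters(n_alleles, genotypes, min_maf=0.05, min_call_rate=0.9):
    """Keep biallelic sites with call rate >= min_call_rate and MAF >= min_maf.

    genotypes are VCF-style strings such as "0/1" or "./." (missing).
    """
    if n_alleles > 2 or not genotypes:
        return False
    called = [g for g in genotypes if "." not in g]
    if len(called) / len(genotypes) < min_call_rate:
        return False
    alleles = [a for g in called for a in g.replace("|", "/").split("/")]
    if not alleles:
        return False
    alt_freq = sum(a != "0" for a in alleles) / len(alleles)
    return min(alt_freq, 1.0 - alt_freq) >= min_maf


if __name__ == "__main__":
    print(failed_hard_filters({"QD": 4.0, "FS": 10.0, "MQ": 60.0}))
    print(passes_site_filters(2, ["0/0"] * 8 + ["0/1"] * 2))
```

A SNP would be retained only if `failed_hard_filters` returns an empty list and `passes_site_filters` returns `True`.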
Funding
The National Key Research and Development Program of China
2022YFD2400204
The Key Special Program on Science and Technology Innovation in Marine Agriculture and Freshwater Fisheries
2023YFD2401705
The Open Competition Program of the Top Nine Critical Priorities of Agricultural Science and Technology Innovation for the 14th Five-Year Plan of Guangdong Province
2023SDZG01
The Construction Project of Modern Seed Industry Park for Whiteleg Shrimp of Guangdong Province
GDSCYY2022-005
The Research Project of Guangdong Ocean University
2021ZDZX1031