Genome map of low-complexity regions, simple sequence repeats (SSRs), and putative open reading frames (ORFs) in the chromosome-level assembly YMM1 of yerba mate (Ilex paraguariensis)
Description
This dataset comprises genome annotation files (GFF format) describing features identified on the chromosome-level assembly YMM1 of yerba mate (Ilex paraguariensis). The annotations were generated with the following tools:
1. RepeatMasker: identification and classification of low-complexity regions (result: 193,751 features).
2. SSR finder: localization of simple sequence repeats (SSRs), also known as microsatellites (result: 167,882 features).
3. getorf: prediction of putative open reading frames (ORFs) (result: 13,493,413 features).
The resulting annotated loci files are available to the community for downstream genomic analysis.
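As a quick sanity check after download, the feature totals reported above can be compared with a count of the data lines in each GFF file. The following is a minimal sketch, assuming standard nine-column, tab-separated GFF records; the function name and the example file name are hypothetical, since the actual file names are not listed here.

```python
from collections import Counter


def count_gff_features(lines):
    """Count GFF records per feature type (column 3).

    Comment lines (starting with '#'), blank lines, and lines with
    fewer than nine tab-separated columns are skipped.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GFF record
        counts[fields[2]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical usage: replace the path with one of the downloaded files.
    # with open("YMM1_ssr.gff") as handle:
    #     print(sum(count_gff_features(handle).values()))
    pass
```

Summing the returned counts gives the total number of features in a file, which can then be compared with the figure reported for the corresponding tool.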
Steps to reproduce
The analyzed genome is available at:
https://ngdc.cncb.ac.cn/omix/release/OMIX007912
https://data.mendeley.com/datasets/8v9ws627tg/1
Methods:
- RepeatMasker 4.1.1 (option: mask only low-complexity/simple repeats).
- SSR finder 1.0 (Dinucleotides: ssr_2_tb : 5; Trinucleotides: ssr_3_tb : 4; Tetranucleotides: ssr_4_tb : 3; Pentanucleotides: ssr_5_tb : 3; Hexanucleotides: ssr_6_tb : 3).
- getorf 6.6.0 (Min Nucleotide Size of ORF to Report: getorf_min_tb : 30; Max Nucleotide Size of ORF to Report: getorf_max_tb : 1000000).
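To illustrate what the SSR thresholds above would select, the sketch below scans a sequence for perfect tandem repeats with a regular expression. It is not the SSR finder implementation: it assumes that each ssr_N_tb value is the minimum number of tandem copies for a motif of length N, and the names `MIN_REPEATS` and `find_ssrs` are hypothetical.

```python
import re

# Assumption: each value is the minimum number of tandem copies required
# for a motif of that length (2 = dinucleotide, ..., 6 = hexanucleotide).
MIN_REPEATS = {2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, end, motif, copies) tuples with 1-based inclusive coordinates."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures a motif; \1{n-1,} demands at least n copies in total.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example "AAAA" or "ATAT"), so each locus is reported once.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            copies = (m.end() - m.start()) // k
            hits.append((m.start() + 1, m.end(), motif, copies))
    return hits


if __name__ == "__main__":
    for hit in find_ssrs("GG" + "AT" * 5 + "TT" + "AGC" * 4 + "T"):
        print(hit)
```

This simplified scan reports only perfect, non-overlapping repeats and ignores bases other than A, C, G, and T, so its counts should not be expected to match the published totals exactly.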
Institutions
- Universidad Nacional de Misiones, Misiones, Posadas
- CONICET Nordeste, Corrientes, Corrientes
- Instituto de Biologia Subtropical, Misiones Province, Posadas