Genome map of low-complexity regions, simple sequence repeats (SSRs), and putative open reading frames (ORFs) in the chromosome-level assembly YMM1 of yerba mate (Ilex paraguariensis)

Published: 20 April 2026 | Version 1 | DOI: 10.17632/cbjg3y8nvd.1
Contributors:
Mauro Grabiele

Description

This dataset comprises genome annotation files (GFF format) describing features identified on the chromosome-level assembly YMM1 of yerba mate (Ilex paraguariensis). The annotations were generated with the following tools:

1. RepeatMasker: identification and classification of low-complexity regions (193,751 features).
2. SSR finder: localization of simple sequence repeats (SSRs), also known as microsatellites (167,882 features).
3. getorf: prediction of putative open reading frames (ORFs) (13,493,413 features).

The resulting annotated loci files are available to the community for downstream genomic analysis.
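As a quick sanity check after download, the feature totals quoted above can be compared against a count of the data rows in each GFF file. The following is a minimal sketch, assuming standard nine-column, tab-delimited GFF rows; the function name `count_gff_features` and the sample rows (sequence name, source, and type values) are hypothetical and are not taken from the dataset files.

```python
import io
from collections import Counter


def count_gff_features(handle):
    """Count data rows per (source, type) pair in a GFF stream."""
    counts = Counter()
    for line in handle:
        line = line.rstrip("\n")
        # Skip blank lines, comments, and directives such as "##gff-version 3".
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A GFF data row has nine tab-separated columns; ignore anything else.
        if len(fields) < 9:
            continue
        counts[(fields[1], fields[2])] += 1
    return counts


# Hypothetical rows for illustration only (not real records from YMM1).
sample = io.StringIO(
    "##gff-version 3\n"
    "chr1\tgetorf\tORF\t10\t99\t.\t+\t.\tID=orf1\n"
    "chr1\tgetorf\tORF\t150\t239\t.\t-\t.\tID=orf2\n"
    "chr1\tSSRfinder\tSSR\t300\t311\t.\t+\t.\tID=ssr1\n"
)
for key, n in sorted(count_gff_features(sample).items()):
    print(key, n)
```

For the real files, pass an open file handle instead of the in-memory sample; because the input is read line by line, the 13-million-row ORF file does not need to fit in memory.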

Steps to reproduce

The analyzed genome is available at:
https://ngdc.cncb.ac.cn/omix/release/OMIX007912
https://data.mendeley.com/datasets/8v9ws627tg/1

Methods:

1. RepeatMasker 4.1.1 (option: mask only low-complexity/simple repeats).
2. SSR finder 1.0 (dinucleotides, ssr_2_tb: 5; trinucleotides, ssr_3_tb: 4; tetranucleotides, ssr_4_tb: 3; pentanucleotides, ssr_5_tb: 3; hexanucleotides, ssr_6_tb: 3).
3. getorf 6.6.0 (minimum nucleotide size of ORF to report, getorf_min_tb: 30; maximum nucleotide size of ORF to report, getorf_max_tb: 1000000).
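To illustrate how per-motif thresholds like those listed for SSR finder act on a sequence, here is a minimal regex-based sketch. It assumes that each `ssr_N_tb` value is the minimum number of tandem copies required for a motif of length N; that reading is an assumption and is not confirmed by the record. The code is not SSR finder's own algorithm, and the names `MIN_REPEATS` and `find_ssrs` are hypothetical.

```python
import re

# Assumed meaning: motif length -> minimum number of tandem copies.
MIN_REPEATS = {2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, end, motif, copies) tuples with 1-based inclusive coordinates."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One captured motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip motifs that are repeats of a shorter motif, such as "AA"
            # or "ATAT", so a dinucleotide run is not also reported as a
            # tetranucleotide repeat.
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            copies = (m.end() - m.start()) // unit_len
            hits.append((m.start() + 1, m.end(), unit, copies))
    return sorted(hits)


# Toy sequence: six copies of AT and four copies of GAA.
toy = "GG" + "AT" * 6 + "CC" + "GAA" * 4 + "T"
for hit in find_ssrs(toy):
    print(hit)
```

Under this reading, a run of four AT copies would not be reported, whereas four GAA copies would, because the trinucleotide threshold is lower than the dinucleotide one.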

Categories

Genomics, Plant Biology, DNA Repeated Sequence
