Published: 17 April 2023 | Version 1 | DOI: 10.17632/6b54424fgs.1
Jose M. Requena


This dataset contains the coordinates of all possible ORFs longer than 20 triplets, together with the corresponding amino acid sequences, obtained by reading the Leishmania donovani (HU3 strain) genome in all six frames.

Two files are provided:
(i) LdHU3_All-ORFabove20Aas (GFF file/text file), which contains the genomic coordinates of the ORFs;
(ii) LdHU3_TodosPeptidos+20, which contains the polypeptide sequences.

In both files, each entry is named with the chromosome number and the coordinates of the ORF. For instance, the first entry is LdHU3.01:442..603:r, which means that the ORF/amino acid sequence was read on the minus strand (:r), between positions 442 and 603 of chromosome 1.

The genome sequence used to create this dataset may be accessed at: https://www.wikidata.org/wiki/Q97940468

A dataset with the L. donovani genome sequence (fasta file) may also be downloaded through the following Mendeley Data entry: Requena, Jose M. (2023), “LdHU3_Genome sequence”, Mendeley Data, V2, doi: 10.17632/b82fm2w2h9.2

The assembly of this genome is described in the article: Camacho, E., González-de la Fuente, S., Rastrojo, A., Peiró-Pastor, R., Solana, J.C., Tabera, L., Gamarro, F., Carrasco-Ramiro, F., Requena, J.M., and Aguado, B. (2019). Complete assembly of the Leishmania donovani (HU3 strain) genome and transcriptome annotation. Sci. Rep. 9, 6127. (PMID: 30992521). https://rdcu.be/bxaEi
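As an illustration of the entry naming scheme described above (for example LdHU3.01:442..603:r), the following minimal Python sketch splits an entry name into its chromosome, coordinates and strand tag. The names OrfName and parse_orf_name are hypothetical and are not part of the dataset. The sketch assumes that coordinates are 1-based and inclusive, as is conventional for GFF files. Only the ":r" (minus strand) suffix is documented in the description, so any other suffix is returned unchanged rather than interpreted.

```python
import re
from typing import NamedTuple


class OrfName(NamedTuple):
    """Parsed parts of an entry name (hypothetical helper type)."""
    chromosome: str   # e.g. "LdHU3.01"
    start: int        # lower coordinate, as written in the name
    end: int          # upper coordinate, as written in the name
    strand_tag: str   # "r" is documented as minus strand; others are passed through


# Pattern for names of the form <chromosome>:<start>..<end>:<tag>
_NAME_PATTERN = re.compile(
    r"^(?P<chrom>[^:]+):(?P<start>\d+)\.\.(?P<end>\d+):(?P<tag>\w+)$"
)


def parse_orf_name(name: str) -> OrfName:
    """Split an entry name into chromosome, coordinates and strand tag."""
    match = _NAME_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"Unrecognised entry name: {name!r}")
    return OrfName(
        chromosome=match["chrom"],
        start=int(match["start"]),
        end=int(match["end"]),
        strand_tag=match["tag"],
    )


# Example taken from the dataset description.
orf = parse_orf_name("LdHU3.01:442..603:r")

# Span in nucleotides, assuming 1-based inclusive coordinates (assumption).
span_nt = orf.end - orf.start + 1
span_triplets = span_nt // 3
```

Under the inclusive-coordinate assumption, the example entry spans 162 nucleotides, that is, 54 triplets, which is consistent with the stated minimum length of more than 20 triplets. Whether the span includes the stop codon is not stated in the description.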



