Leishmania infantum_All-ORFs

Published: 18 October 2022 | Version 2 | DOI: 10.17632/m2rs2cdm93.2
Contributors:
Jose M. Requena

Description

This dataset contains all possible amino acid sequences longer than 20 amino acids that result from reading the Leishmania infantum (JPCM5 strain) genome in all six frames. The dataset was used for the analysis of proteomic data derived from L. infantum promastigotes, as described elsewhere (Sanchiz, Á., Morato, E., Rastrojo, A., Camacho, E., González-de la Fuente, S., Marina, A., Aguado, B., and Requena, J.M. (2020). The Experimental Proteome of Leishmania infantum Promastigote and Its Usefulness for Improving Gene Annotations. Genes (Basel). 11, E1036; PMID: 32887454).

The entry names contain the chromosome and the coordinates at which a given amino acid sequence is encoded. For instance, the first entry is LinJ.01:197..358:r, which means that the amino acid sequence was read on the minus strand, between positions 197 and 358 of chromosome 1.

The genome sequence used for creating this dataset may be accessed at: https://www.wikidata.org/wiki/Q97597959

A dataset for the L. infantum genome sequence may also be downloaded through the following Mendeley Data entry: Requena, Jose M. (2021), “LINF_Genome sequence”, Mendeley Data, V1, doi: 10.17632/rb34cg9xk7.1

The assembly of this genome is described in the article: Gonzalez-de la Fuente, S., Peiro-Pastor, R., Rastrojo, A., Moreno, J., Carrasco-Ramiro, F., Requena, J.M., and Aguado, B. (2017). Resequencing of the Leishmania infantum (strain JPCM5) genome and de novo assembly into 36 contigs. Sci Rep 7, 18050 (PMID: 29273719).
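The entry-name convention described above (chromosome, coordinates, strand flag) could be parsed along the following lines. This is a minimal sketch, not code distributed with the dataset: the names `OrfLocation`, `parse_entry_name` and `nucleotide_span` are hypothetical, and it assumes that any strand flag other than `r` denotes the plus strand, which the description does not state explicitly.

```python
import re
from typing import NamedTuple


class OrfLocation(NamedTuple):
    """Genomic location encoded in an entry name (hypothetical helper type)."""
    chromosome: str
    start: int
    end: int
    strand: str  # "+" or "-"


# Pattern for names such as "LinJ.01:197..358:r".
# Assumption: the last field is a single letter; only "r" (minus strand)
# is documented in the dataset description.
_ENTRY_RE = re.compile(
    r"^(?P<chrom>[^:\s]+):(?P<start>\d+)\.\.(?P<end>\d+):(?P<flag>[A-Za-z])$"
)


def parse_entry_name(name: str) -> OrfLocation:
    """Split an entry name into chromosome, start, end and strand."""
    match = _ENTRY_RE.match(name.strip().lstrip(">"))
    if match is None:
        raise ValueError(f"Unrecognised entry name: {name!r}")
    # Assumption: every flag other than "r" is treated as the plus strand.
    strand = "-" if match["flag"] == "r" else "+"
    return OrfLocation(
        match["chrom"], int(match["start"]), int(match["end"]), strand
    )


def nucleotide_span(location: OrfLocation) -> int:
    """Number of nucleotides covered, treating both coordinates as inclusive."""
    return location.end - location.start + 1
```

For the example entry, `parse_entry_name("LinJ.01:197..358:r")` yields chromosome `LinJ.01`, start 197, end 358 and strand `-`, and `nucleotide_span` gives 162 nucleotides if both coordinates are inclusive.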

Files

Steps to reproduce

LinJ-All-posibles-ORFs+20.txt (GFF file)
LinJ-All-posibles-proteins+20Aas.fasta (FASTA file)
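As an illustration of how the FASTA file listed above might be read, the sketch below iterates over records using only the Python standard library. It assumes the standard FASTA layout (a `>` header line followed by one or more sequence lines) and that each header carries the entry name; the function names `read_fasta` and `longer_than` are hypothetical and not part of the dataset.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def longer_than(
    records: Iterable[Tuple[str, str]], min_length: int = 20
) -> Iterator[Tuple[str, str]]:
    """Keep records whose sequence is strictly longer than min_length residues."""
    for name, sequence in records:
        if len(sequence) > min_length:
            yield name, sequence


# Possible usage, assuming the file has been downloaded to the working directory:
#
#     with open("LinJ-All-posibles-proteins+20Aas.fasta") as handle:
#         for name, sequence in read_fasta(handle):
#             print(name, len(sequence))
```

Reading line by line keeps memory use low, which may matter for a six-frame translation of a whole genome. The `longer_than` filter mirrors the length threshold stated in the description and is only needed when applying the same cut-off to other sequence sets.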

Institutions

Universidad Autonoma de Madrid

Categories

Protein, Amino Acid Sequence Analysis, Leishmania, Database

Licence