SeqLengthPlot outputs on ORFs from the single-end transcriptome of Savalia savaglia

Published: 11 June 2024 | Version 1 | DOI: 10.17632/sh79mdcm2c.1
Contributors:
Dany Domínguez Pérez

Description

This dataset contains the output folder generated by SeqLengthPlot when run on the Open Reading Frames (ORFs) translated with TransDecoder from the single-end transcriptome of Savalia savaglia. The folder seq_length_Assembly_Ss_SE.Trinity.fasta.transdecoder contains:
• seq_above99aa.fasta: FASTA file containing the translated ORFs with lengths of 100 aa and above, obtained by splitting the input FASTA file at the given threshold.
• seq_below100aa.fasta: FASTA file containing the translated ORFs with lengths below 100 aa, obtained by splitting the input FASTA file at the given threshold.
• seq_length_distribution_above99aa.png: PNG image showing a histogram of ORF lengths of 100 aa and above on a linear scale.
• seq_length_distribution_above99_log.png: PNG image showing a histogram of ORF lengths of 100 aa and above on a logarithmic scale.
• seq_length_distribution_below100aa.png: PNG image showing a histogram of ORF lengths below 100 aa on a linear scale.
• seq_length_distribution_below100_log.png: PNG image showing a histogram of ORF lengths below 100 aa on a logarithmic scale.
• seq_length_stats_by_threshold_100.txt: Text file containing detailed statistics of the ORF lengths in the input FASTA file: the total number of sequences, the number of sequences of 100 aa and above, the number of sequences below 100 aa, and the minimum and maximum lengths of each group.
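The threshold-based split and the summary statistics described above can be illustrated with a minimal Python sketch. This is not the SeqLengthPlot source code: the function names (read_fasta, split_by_length, length_stats) and the toy sequences are hypothetical, and the sketch assumes that length is counted as the number of characters in each sequence exactly as written in the FASTA file.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_length(records, threshold=100):
    """Split records into (length >= threshold, length < threshold)."""
    records = list(records)  # allow generators to be passed in
    above = [(h, s) for h, s in records if len(s) >= threshold]
    below = [(h, s) for h, s in records if len(s) < threshold]
    return above, below


def length_stats(above, below):
    """Summarise counts and min/max lengths for both groups."""
    def min_max(group):
        lengths = [len(s) for _, s in group]
        return (min(lengths), max(lengths)) if lengths else (None, None)

    return {
        "total": len(above) + len(below),
        "n_above": len(above),
        "n_below": len(below),
        "above_min_max": min_max(above),
        "below_min_max": min_max(below),
    }


# Toy input (hypothetical sequences, not taken from the dataset).
toy_fasta = StringIO(
    ">orf1\n" + "M" * 60 + "\n" + "A" * 60 + "\n"   # 120 aa, wrapped over two lines
    ">orf2\n" + "M" * 100 + "\n"                     # exactly 100 aa
    ">orf3\n" + "M" * 99 + "\n"                      # 99 aa
    ">orf4\n" + "MKLVA\n"                            # 5 aa
)

above, below = split_by_length(read_fasta(toy_fasta), threshold=100)
stats = length_stats(above, below)
print(stats)
```

With a threshold of 100, a sequence of exactly 100 aa falls into the "above" group, which matches the file naming in this dataset (above99 and below100).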

Files

Steps to reproduce

The folder seq_length_Assembly_Ss_SE.Trinity.fasta.transdecoder is the output of running the Python-based script SeqLengthPlot.py on the ORF file Assembly_Ss_SE.Trinity.fasta.transdecoder.pep, using a length cutoff of 100 amino acids (aa). The input FASTA file was previously translated with TransDecoder v5.7.1 from the single-end transcriptome of Savalia savaglia.
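When reproducing the run, it may help to check that the expected files are present. The sketch below derives the output folder and file names from the input file name and the cutoff. The naming pattern is inferred only from the file names listed in this dataset, not from the SeqLengthPlot.py source, so it is an assumption; the function name expected_outputs is hypothetical.

```python
from pathlib import Path


def expected_outputs(input_fasta, threshold=100):
    """Return (folder_name, file_names) following the pattern seen in this dataset.

    Assumption: the folder is 'seq_length_' plus the input name without its
    final extension, and the file labels use threshold - 1 for the 'above'
    group and threshold for the 'below' group.
    """
    stem = Path(input_fasta).stem  # drops only the last extension (.pep)
    folder = f"seq_length_{stem}"
    above = f"above{threshold - 1}"
    below = f"below{threshold}"
    files = [
        f"seq_{above}aa.fasta",
        f"seq_{below}aa.fasta",
        f"seq_length_distribution_{above}aa.png",
        f"seq_length_distribution_{above}_log.png",
        f"seq_length_distribution_{below}aa.png",
        f"seq_length_distribution_{below}_log.png",
        f"seq_length_stats_by_threshold_{threshold}.txt",
    ]
    return folder, files


folder, files = expected_outputs(
    "Assembly_Ss_SE.Trinity.fasta.transdecoder.pep", threshold=100
)
print(folder)
for name in files:
    print(" ", name)

# Optional check against a local copy of the folder, if one exists.
missing = [n for n in files if not (Path(folder) / n).exists()]
```

After running the script locally, an empty missing list would indicate that all seven files described above were produced under the assumed names.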

Institutions

Stazione Zoologica Anton Dohrn

Categories

Sequence Analysis

Funding

This work was supported by Centro Ricerche ed Infrastrutture Marine Avanzate in Calabria (CRIMAC) - Fondo FSC 2014-2020 - Piano Stralcio «Ricerca e Innovazione 2015-2017» – Programma Nazionale Infrastrutture di Ricerca (PNIR), CUP C64I20000320001.

Licence