ORFs and annotations of the single-end transcriptome from the forward unpaired reads of Savalia savaglia's RNAseq

Published: 11 June 2024 | Version 1 | DOI: 10.17632/3rtbr7c9s8.1
Dany Domínguez Pérez


This dataset contains Open Reading Frames (ORFs) and annotations of the original single-end transcriptome outputs, obtained from the forward broken paired-end RNAseq of the false coral Savalia savaglia. The dataset includes the following files:

From TransDecoder analyses:
• Assembly_Ss_SE.Trinity.fasta.transdecoder.cds: Nucleotide sequences for coding regions of the final candidate ORFs.
• Assembly_Ss_SE.Trinity.fasta.transdecoder.gff3: Positions within the target transcripts of the final selected ORFs.
• Assembly_Ss_SE.Trinity.fasta.transdecoder.pep: Peptide sequences for the final candidate ORFs, with shorter candidates within longer ORFs removed.
• Assembly_Ss_SE.Trinity.fasta.transdecoder.bed: BED-formatted file describing ORF positions, suitable for viewing using GenomeView or IGV.
• blastp.outfmt6.w_pct_hit_length: File providing percentages of hit lengths from BLASTp results, including top hit's length and percent of the length covered in the alignment.
• pfam.domtblout: PFAM domain annotations for the predicted proteins.

From Trinotate analyses:
• myTrinotate_SE_Ss.tsv: Comprehensive annotation file with results from Trinotate, including protein domain identification and other annotations.
• Trinotate_SE_Ss_report.cXp_summary.html: HTML report summarizing the annotation results from Trinotate, providing an overview of the functional annotations and transcript features.
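As a quick sanity check after downloading, the peptide file can be summarized with a few lines of Python. The following is a minimal sketch, not part of the dataset: the helper names `read_fasta` and `summarize_orfs` are hypothetical, and the sketch assumes that the FASTA headers carry a `type:<label>` token (for example `type:complete`), as TransDecoder output typically does. Headers without that token are counted as `unknown`.

```python
import re
from collections import Counter
from typing import Dict, Iterator, Tuple


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA file (hypothetical helper)."""
    header, chunks = None, []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Assumption: the ORF category appears in the header as "type:<label>".
ORF_TYPE = re.compile(r"type:(\S+)")


def summarize_orfs(path: str) -> Dict[str, object]:
    """Count ORFs, average peptide length, and ORF types in a .pep file."""
    lengths = []
    types = Counter()
    for header, seq in read_fasta(path):
        # Trailing "*" marks a stop codon and is not counted as a residue.
        lengths.append(len(seq.rstrip("*")))
        match = ORF_TYPE.search(header)
        types[match.group(1) if match else "unknown"] += 1
    mean_length = sum(lengths) / len(lengths) if lengths else 0.0
    return {"n_orfs": len(lengths), "mean_length": mean_length, "types": dict(types)}
```

Calling `summarize_orfs("Assembly_Ss_SE.Trinity.fasta.transdecoder.pep")` on the downloaded file would return the number of candidate ORFs, their mean peptide length, and a count per ORF type. The same `read_fasta` helper should also work for the `.cds` file, since both are plain FASTA.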


Steps to reproduce

ORFs and annotations were generated with TransDecoder v5.7.1 and Trinotate v4.0.2, using the single-end transcriptome assembled from the forward broken paired-end reads of the false coral Savalia savaglia.


Stazione Zoologica Anton Dohrn


Transcriptomics, Protein Annotation


This work was supported by Centro Ricerche ed Infrastrutture Marine Avanzate in Calabria (CRIMAC) - Fondo FSC 2014-2020 - Piano Stralcio «Ricerca e Innovazione 2015-2017» – Programma Nazionale Infrastrutture di Ricerca (PNIR), CUP C64I20000320001.