DATASET S1: Transcriptomic profile of the cockle Cerastoderma edule exposed to Diarrhetic Shellfish Toxins seasonal contamination
Description
The .fasta and .gff files resulting from the de novo transcriptome assembly and clustering analyses reported in the article: Transcriptomic profile of the cockle Cerastoderma edule exposed to Diarrhetic Shellfish Toxins seasonal contamination. Domínguez-Pérez, D. et al., 2021.

Ce_assembly_unique.fasta: The de novo transcriptome assembly of C. edule before submission to NCBI, obtained with Trinity v2.10.0 by combining four samples (gills and digestive glands, each with and without DST exposure) into a single transcriptome.
C.edule_assembly_submitted_ncbi.fasta: The de novo transcriptome assembly submitted to NCBI (registered under BioProject ID PRJNA739261, http://www.ncbi.nlm.nih.gov/bioproject/739261). Note that a total of 147 transcripts shorter than 200 bp were removed during NCBI submission.
Ce_assembly_unique_supertranscript.fasta: The supertranscript file constructed from the de novo transcriptome assembly of C. edule.
Ce_assembly_unique.gff: The .gff file obtained from the de novo transcriptome assembly of C. edule.
transcript_to_gene_map.txt: The transcript-to-gene mapping file obtained from the de novo transcriptome assembly of C. edule and used for the clustering analyses.
rna_seq_de_novo_assembly_trinity_output.pdf: The output file containing the summary statistics of the de novo transcriptome assembly of C. edule.
Ce_assembly_unique_clustering.fasta: The clustered .fasta file produced from the de novo transcriptome assembly of C. edule using CD-HIT 4.8.1 with a sequence identity threshold of 0.9.
cluster_distribution_[assembled_transcripts].pdf: Figure showing the cluster distribution of the assembled transcripts.
clustering_results_assembled_transcripts.pdf: Summary of the clustering analyses performed with CD-HIT 4.8.1.
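
The difference between the pre-submission assembly and the NCBI-submitted assembly (transcripts shorter than 200 bp removed) can be checked with a short script. The following is a minimal sketch in plain Python, not the procedure used by the authors or by NCBI: the function names read_fasta and split_by_length are hypothetical, and it assumes that the length cut-off is the only difference between the two files.

```python
from typing import Iterable, Iterator, List, Tuple

# Assumption: the NCBI submission kept only transcripts of at least 200 bp.
MIN_LENGTH = 200

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        yield header, "".join(chunks)


def split_by_length(
    records: Iterable[Record], min_length: int = MIN_LENGTH
) -> Tuple[List[Record], List[Record]]:
    """Split records into (kept, dropped) using a minimum sequence length."""
    kept: List[Record] = []
    dropped: List[Record] = []
    for header, seq in records:
        if len(seq) >= min_length:
            kept.append((header, seq))
        else:
            dropped.append((header, seq))
    return kept, dropped
```

To apply it, open Ce_assembly_unique.fasta, pass the file handle to read_fasta, and pass the result to split_by_length. If the assumption above holds, the number of dropped records would be expected to match the 147 transcripts reported as removed.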