Transcriptomic analysis of immune-related genes in Pacific white snook (Centropomus viridis) gills infected with the monogenean parasite Rhabdosynochus viridisi
Description
Transcriptomic data of Pacific white snook (Centropomus viridis) gills infected with the monogenean parasite Rhabdosynochus viridisi
Steps to reproduce
ExpressAnalyst was used for the Differential Expression Analysis (DEA). Raw reads were downloaded from the server and uploaded to the ExpressAnalyst Docker. The Seq2Fun algorithm was selected, without a reference transcriptome, to perform quality control and map the reads to the “fishes” database (downloaded on April 24, 2024) from the EcoOmics DataBase (EODB, available at https://www.ecoomicsdb.ca/). The “fishes” database contained protein sequences from 62 fish species. Seq2Fun performed translated searches of the RNA-seq reads against this database only, which excludes the annotation of orthologs from the parasite or other organisms. The Seq2Fun output files were a count table and an ortholog annotation table.
The count table was uploaded to the ExpressAnalyst web server (https://www.expressanalyst.ca/ExpressAnalyst/uploads/TableUploadView.xhtml) with the following specifications: Specify organism = Generic/Species independent; Analysis Type = Differential Expression; Data Type = Counts (bulk RNA-seq); ID Type = Seq2Fun Ortholog ID; Metadata included. The annotation libraries in ExpressAnalyst are updated yearly, based on the latest ID versions available from NCBI (Entrez, RefSeq), Ensembl, and UniProt. This pipeline was selected because it performs similarly to de novo transcriptome assembly in less time.
Unannotated reads, reads with a count lower than four, and reads with a variance percentile rank lower than 15 (those with stable expression values across conditions) were filtered out, and the remaining reads were normalized using the Relative Log Expression (RLE) normalization method. Simple Metadata, the Limma Statistical Method without robust trend adjustment, and a Specific Comparison between the infected and control groups were selected for DEA. Limma was selected because it uses a moderated t-statistic (two-sided) and performs better than edgeR and DESeq2.
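The filtering and normalization steps described above can be illustrated with a minimal Python sketch. It is not the ExpressAnalyst implementation. The function names (`filter_features`, `rle_size_factors`, `normalize`) are hypothetical, the abundance threshold is assumed to apply to the mean count of each row, the variance rank is assumed to be computed on log2(count + 1), and RLE is assumed to be the median-of-ratios procedure popularized by DESeq. The web server may define each of these differently.

```python
import math
from statistics import median, pvariance


def filter_features(counts, min_count=4, var_percentile=15):
    """Drop low-abundance rows, then the least variable rows.

    counts maps a feature ID to a list of raw counts, one per sample.
    Assumption: abundance is judged on the mean count of the row.
    """
    kept = {f: v for f, v in counts.items() if sum(v) / len(v) >= min_count}
    # Rank the surviving rows by variance of log2(count + 1), lowest first.
    ranked = sorted(
        kept, key=lambda f: pvariance([math.log2(x + 1) for x in kept[f]])
    )
    n_drop = int(len(ranked) * var_percentile / 100)
    dropped = set(ranked[:n_drop])
    return {f: v for f, v in kept.items() if f not in dropped}


def rle_size_factors(counts):
    """Median-of-ratios size factors, one per sample."""
    # Only rows without zeros have a defined geometric mean.
    rows = [v for v in counts.values() if all(x > 0 for x in v)]
    if not rows:
        raise ValueError("no feature has non-zero counts in every sample")
    n_samples = len(rows[0])
    log_geo_means = [sum(math.log(x) for x in row) / n_samples for row in rows]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - g for row, g in zip(rows, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize(counts):
    """Divide each sample's counts by that sample's size factor."""
    factors = rle_size_factors(counts)
    return {f: [x / s for x, s in zip(v, factors)] for f, v in counts.items()}
```

In this sketch the abundance filter runs before the variance filter, so the percentile is taken over the rows that already passed the count threshold. That ordering is also an assumption.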
Genes were considered differentially expressed (DEGs) when the fold change (FC) was at least two in either direction (|log2 FC| ≥ 1) and the False Discovery Rate (FDR) was at most 0.05 (FDR ≤ 0.05). Overrepresentation analysis (ORA) and Gene Set Enrichment Analysis (GSEA) were used to identify significant KEGG pathways, as well as Biological Processes and Molecular Functions from the Gene Ontology (GO) database.
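The DEG thresholds can be applied to a table of test results as in the following sketch. The source does not state which FDR procedure was used, so the Benjamini-Hochberg adjustment is assumed here. The function names and the `(feature_id, log2_fc, p_value)` input layout are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(results, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Keep features with |log2 FC| >= lfc_cutoff and FDR <= fdr_cutoff.

    results is a list of (feature_id, log2_fc, p_value) tuples.
    """
    fdr = benjamini_hochberg([p for _, _, p in results])
    return [
        (fid, lfc, q)
        for (fid, lfc, _), q in zip(results, fdr)
        if abs(lfc) >= lfc_cutoff and q <= fdr_cutoff
    ]
```

The absolute value is taken after the log2 transform, so up-regulated and down-regulated features are treated symmetrically.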
Funding
Mexican National Council on Humanities, Science and Technology (CONAHCYT)
1715616