The Influence of Transcript Assembly on Proteogenomics Discovery of Microproteins

Name: The Influence of Transcript Assembly on Proteogenomics Discovery of Microproteins
Creator: Max Shokhirev
Published: 2018-01-29T20:17:37.490Z
Keywords: Genomics, Proteogenomics

doi:10.17632/sjbnjr7brz.1

Published: 29 January 2018 | Version 1 | DOI: 10.17632/sjbnjr7brz.1

Contributor: Max Shokhirev

Description

Supplementary dataset for "The Influence of Transcript Assembly on Proteogenomics Discovery of Microproteins". This dataset contains paired RNA-Seq reads simulated with flux-simulator, in fastq.gz format, together with the flux-simulator parameter file, hg19.par. Both are located in the flux_simulator folder. The reads were generated from the human hg19 genome to test transcript assembly, with genes defined by the hg19 RefSeq annotation (see hg19_refseq.gtf). The hg19 chromosome sequence files (e.g. chr1.fa) are also included for completeness and are located in the hg19 folder.
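As a quick sanity check after download, the number of records in one of the fastq.gz files can be counted. The following is a minimal sketch, not part of the dataset: it assumes the standard four-lines-per-record FASTQ layout, and the function names and the file name in the usage note are hypothetical.

```python
import gzip


def count_fastq_records(handle) -> int:
    """Count records in a FASTQ text stream, assuming four lines per record."""
    n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        # A line count that is not a multiple of four suggests truncation.
        raise ValueError(f"{n_lines} lines is not a multiple of 4")
    return n_lines // 4


def count_fastq_gz(path: str) -> int:
    """Open a gzip-compressed FASTQ file in text mode and count its records."""
    with gzip.open(path, "rt") as handle:
        return count_fastq_records(handle)
```

For example, `count_fastq_gz("reads.fastq.gz")` (a placeholder name; substitute a file from the flux_simulator folder) returns the number of reads in that file.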

Steps to reproduce

Download and install flux-simulator (v1.2.1 with Flux Library 1.22). Before running it, change GEN_DIR in hg19.par to point to the directory containing the hg19 sequence files and the hg19_refseq.gtf file. Then run flux-simulator with the supplied hg19 parameter file, using the hg19 genomic sequence and annotation. Because flux-simulator generates reads stochastically, the resulting reads should have distributions similar to those of the reads included here and used for testing, but may vary on a gene-by-gene basis.
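The GEN_DIR change can also be scripted. The sketch below makes two assumptions that are not stated in this record: that hg19.par stores one whitespace-separated KEY VALUE pair per line, and that flux-simulator accepts the parameter file through a `-p` option. Check both against your installed version. The helper names `set_gen_dir` and `flux_command` are hypothetical.

```python
import re
from typing import List


def set_gen_dir(par_text: str, genome_dir: str) -> str:
    """Return the parameter-file text with GEN_DIR pointing at genome_dir."""
    lines, replaced = [], False
    for line in par_text.splitlines():
        # Assumed layout: "KEY<whitespace>VALUE", one setting per line.
        if re.match(r"^\s*GEN_DIR(\s|$)", line):
            lines.append(f"GEN_DIR\t{genome_dir}")
            replaced = True
        else:
            lines.append(line)
    if not replaced:
        # No GEN_DIR entry found, so append one.
        lines.append(f"GEN_DIR\t{genome_dir}")
    return "\n".join(lines) + "\n"


def flux_command(par_path: str) -> List[str]:
    """Build the command line; the -p option is an assumption to verify."""
    return ["flux-simulator", "-p", par_path]
```

A typical use would be to read hg19.par, write the output of `set_gen_dir` back to disk, and pass `flux_command("hg19.par")` to `subprocess.run`. Depending on the flux-simulator version, additional options may be needed to select which stages of the simulation run.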

Institutions

Salk Institute for Biological Studies
