CWL run of Somatic Variant Calling Workflow (CWLProv 0.5.0 Research Object)
Description
The somatic variant calling workflow included in this case study was designed by Blue Collar Bioinformatics (bcbio), a community-driven initiative to develop best-practice pipelines for variant calling, RNA-seq and small RNA analysis. According to the documentation, the goal of the project is to facilitate the automated analysis of high-throughput data by making the resources quantifiable, analyzable, scalable, accessible and reproducible. All the underlying tools are containerized, which facilitates software use in the workflow. The somatic variant calling workflow defined in CWL is available on GitHub and comes with a well-defined test dataset. This dataset folder is a CWLProv Research Object that captures the Common Workflow Language execution provenance; see https://w3id.org/cwl/prov/0.5.0 or use https://pypi.org/project/cwlprov/ to explore it.
Files
Steps to reproduce
To build the research object again, use Python 3 on macOS:

    Processor: 2.8GHz Intel Core i7
    Memory: 16GB
    OS: macOS High Sierra, Version 10.13.3
    Storage: 250GB

To run the workflow:

    pip3 install cwltool==1.0.20180912090223
    git clone https://github.com/FarahZKhan/bcbio_test_cwlprov
    cd bcbio_test_cwlprov/somatic/somatic-workflow/
    cwltool --provenance somaticwf_0.5.0_mac main-somatic.cwl main-somatic-samples.json

To package the research object:

    zip -r somaticwf_0.5.0_mac.zip somaticwf_0.5.0_mac/
    sha256sum somaticwf_0.5.0_mac.zip > somaticwf_0.5.0_mac.zip.sha256

The cloned Git repository is a fork of https://github.com/bcbio/test_bcbio_cwl. It was obtained using:

    wget -O test_bcbio_cwl.tar.gz https://github.com/bcbio/test_bcbio_cwl/archive/master.tar.gz

The content is from an archived version from the documentation here: https://bcbio-nextgen.readthedocs.io/en/latest/contents/cwl.html#install-bcbio-vm-with-containers
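The packaging step writes a SHA-256 checksum next to the zip archive. As a minimal sketch of how a downloaded copy could be checked against it, the Python snippet below recomputes the digest and compares it with the recorded one. It assumes the checksum file holds a single line in the usual sha256sum layout (hex digest, whitespace, file name) and that the archive sits in the same directory. The function names sha256_of and verify_checksum_file are hypothetical helpers written for this illustration and are not part of cwltool or cwlprov.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum_file(checksum_path):
    """Check the file named in a sha256sum-style checksum file.

    Assumption: the first line looks like '<hex digest>  <file name>'
    and the named file lives beside the checksum file.
    """
    checksum_path = Path(checksum_path)
    first_line = checksum_path.read_text().splitlines()[0]
    expected, name = first_line.split(maxsplit=1)
    # sha256sum marks binary mode with a leading '*' on the file name.
    name = name.strip().lstrip("*")
    actual = sha256_of(checksum_path.parent / name)
    return actual == expected.lower()


if __name__ == "__main__":
    ok = verify_checksum_file("somaticwf_0.5.0_mac.zip.sha256")
    print("checksum OK" if ok else "checksum MISMATCH")
```

Run from the directory holding both files, the script reports whether the archive still matches its recorded digest. A mismatch suggests the download is incomplete or the archive was modified after packaging.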