CWL run of Somatic Variant Calling Workflow (CWLProv 0.5.0 Research Object)

Published: 4 December 2018 | Version 3 | DOI: 10.17632/97hj93mkfd.3
Farah Zaib Khan, Stian Soiland-Reyes


The somatic variant calling workflow included in this case study was designed by Blue Collar Bioinformatics (bcbio), a community-driven initiative to develop best-practice pipelines for variant calling, RNA-seq and small RNA analysis. According to the documentation, the goal of the project is to facilitate the automated analysis of high-throughput data by making the resources quantifiable, analyzable, scalable, accessible and reproducible. All the underlying tools are containerized, which makes the software easier to use within the workflow. The somatic variant calling workflow, defined in CWL, is available on GitHub and comes with a well-defined test dataset. This dataset folder is a CWLProv Research Object that captures the Common Workflow Language execution provenance, see or use to explore
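As a first check before exploring the research object, the payload files can be compared against the checksums recorded in the folder. The sketch below is not part of the dataset. It assumes the folder follows the BagIt layout that CWLProv builds on, with a manifest named manifest-<algorithm>.txt holding one "<checksum> <relative path>" entry per line. The function names are hypothetical, and percent-encoded path characters are not handled.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str) -> str:
    """Hash a file in chunks so large payload files do not fill memory."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bag_manifest(bag_dir: str, algorithm: str = "sha1") -> list[str]:
    """Return the manifest paths that are missing or whose checksum differs.

    Assumption: bag_dir contains manifest-<algorithm>.txt in BagIt format.
    """
    root = Path(bag_dir)
    manifest = root / f"manifest-{algorithm}.txt"
    mismatches = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, relative_path = line.strip().split(maxsplit=1)
        target = root / relative_path
        if not target.is_file() or file_digest(target, algorithm) != expected.lower():
            mismatches.append(relative_path)
    return mismatches
```

Calling verify_bag_manifest("somaticwf_0.5.0_mac") on the unpacked folder returns an empty list when every listed file is present and unchanged. If the bag ships a different manifest, pass its algorithm name (for example "sha256") as the second argument.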


Steps to reproduce

To build the research object again, use Python 3 on macOS:

Processor: 2.8GHz Intel Core i7
Memory: 16GB
OS: macOS High Sierra, Version 10.13.3
Storage: 250GB

To run the workflow:

pip3 install cwltool==1.0.20180912090223
git clone
cd bcbio_test_cwlprov/somatic/somatic-workflow/
cwltool --provenance somaticwf_0.5.0_mac main-somatic.cwl main-somatic-samples.json

To package the research object:

zip -r somaticwf_0.5.0_mac/
sha256sum >

The cloned git repository is a fork of
It was obtained using:

wget -O test_bcbio_cwl.tar.gz

The content is from an archived version from the documentation here:
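The packaging step can also be done from Python, which avoids depending on a sha256sum binary (not installed by default on macOS). This is a minimal sketch and not the authors' procedure. The function name and the output names (<folder>.zip and <folder>.zip.sha256) are assumptions made for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def package_research_object(ro_dir: str) -> tuple[Path, Path]:
    """Zip ro_dir and write a SHA-256 checksum file next to the archive.

    Assumption: the archive is named <ro_dir>.zip and the checksum file
    <ro_dir>.zip.sha256; both names are hypothetical.
    """
    source = Path(ro_dir).resolve()
    # make_archive appends ".zip" to base_name. root_dir/base_dir keep the
    # top-level folder name inside the archive, as `zip -r` would.
    archive = Path(
        shutil.make_archive(
            base_name=str(source),
            format="zip",
            root_dir=str(source.parent),
            base_dir=source.name,
        )
    )
    digest = hashlib.sha256()
    with archive.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    checksum_file = archive.with_name(archive.name + ".sha256")
    # Same "<hex digest>  <file name>" layout that sha256sum prints.
    checksum_file.write_text(f"{digest.hexdigest()}  {archive.name}\n")
    return archive, checksum_file
```

For example, package_research_object("somaticwf_0.5.0_mac") would be run from the directory that contains the provenance folder written by cwltool. The resulting checksum file can be checked on Linux with sha256sum -c.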


The University of Manchester School of Computer Science, The University of Melbourne Department of Computing and Information Systems


Bioinformatics, Workflow Management, Provenance