M-CAMP™ Cloud Platform 16S-seq Pipeline Validation Data

Published: 23 August 2021 | Version 1 | DOI: 10.17632/xyhphwr4b2.1
Andrew Schriefer


The M-CAMP™ Cloud Platform provides users with an easy-to-use web interface for analyzing microbiome DNA-sequencing data with best-in-class bioinformatics tools. The M-CAMP™ 16S-seq taxonomic classification uses a novel approach that combines alignment-based and k-mer-based taxonomic classification methodologies to produce a highly accurate and comprehensive profile. In this study, we compare the performance of the M-CAMP™ taxonomic classification to previously published 16S classification methods on a large set of validation samples sequenced using V3-V4 hypervariable region primers. This archive contains the data used to validate the microbial taxonomic calls made by the M-CAMP™ 16S-seq classification pipeline. The paired-end FASTQ files of the 62 validation samples are contained in the 16s_seq_data.tar archive, which includes FASTQ files published in previous studies as well as data sequenced for this study. Metadata about each sample can be found in "Table S1" of supplemental_tables.xlsx. The known relative abundances of the samples at multiple taxonomic levels are found in "Table S2". Samples with "NA" in the "Number of Species" column of "Table S1" do not have known relative abundances. "Table S3" contains performance metrics for the M-CAMP™ 16S-seq classification pipeline and for several existing methods used for reference. The primers used to target the V3-V4 region in this study are included in the file v3v4_primers.fasta. The file supplemental_methods.docx describes the commands used to install and run the previously published classification methods in this study.
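
As a starting point for working with the archive described above, the following minimal Python sketch lists the files in 16s_seq_data.tar and groups them into forward/reverse read pairs. The file naming inside the archive is not documented here, so the `_R1`/`_R2` suffix pattern is an assumption, and the function names `pair_fastq_names` and `list_pairs_in_archive` are hypothetical helpers introduced only for illustration.

```python
import re
import tarfile
from collections import defaultdict

# ASSUMPTION: files are named <sample>_R1.fastq(.gz) / <sample>_R2.fastq(.gz),
# optionally with a numeric chunk suffix such as _001. Adjust the pattern if
# the archive uses a different convention.
READ_PATTERN = re.compile(
    r"^(?P<sample>.+?)_R(?P<read>[12])(?:_\d+)?\.(?:fastq|fq)(?:\.gz)?$"
)


def pair_fastq_names(names):
    """Return {sample: (forward_name, reverse_name)} for complete pairs only."""
    reads = defaultdict(dict)
    for name in names:
        base = name.rsplit("/", 1)[-1]  # drop any directory prefix
        match = READ_PATTERN.match(base)
        if match:
            reads[match["sample"]][match["read"]] = name
    # Samples missing either read are left out of the result.
    return {
        sample: (found["1"], found["2"])
        for sample, found in reads.items()
        if {"1", "2"} <= found.keys()
    }


def list_pairs_in_archive(tar_path="16s_seq_data.tar"):
    """Read member names from the tar archive and pair them by sample."""
    with tarfile.open(tar_path) as archive:
        names = [m.name for m in archive.getmembers() if m.isfile()]
    return pair_fastq_names(names)
```

If the assumed naming convention holds, `len(list_pairs_in_archive())` should equal the 62 validation samples stated above; a smaller count would suggest that the pattern needs adjusting rather than that data are missing.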