Genome Mining Data of Bacillus cabrialesii TE3T
This dataset contains (1) the Bacillus cabrialesii TE3T genome, (2) the genome-mining results for Bacillus cabrialesii TE3T, and (3) the results and BGC sequences used to build the biosynthetic gene cluster (BGC) network of Bacillus cabrialesii TE3T with BiG-SCAPE.
Steps to reproduce
Genome assembly: The quality of the raw reads was checked using FastQC, and the reads were trimmed with Trimmomatic to remove adapters and low-quality sequences (Phred score of 30, and no more than two ambiguous bases per read). After trimming, the high-quality reads were assembled using the SPAdes genome assembler (version 3.10.1) with the '--careful' parameter for read error correction and a set of k-mer lengths of 21, 33, 55, 77, 99 and 127 (optimal value = 51). Mauve Contig Mover (MCM) was used to reorder the contigs against the reference genome of B. subtilis CW14 (NCBI project accession PRJNA330772).
Genome mining: The contigs were concatenated into a single sequence using the union tool from the EMBOSS package and submitted to the antibiotics & Secondary Metabolite Analysis Shell (antiSMASH) web server (https://antismash.secondarymetabolites.org) with the detection strictness parameter set to strict.
BGC networking: The BGC network was constructed with a locally installed version of the BiG-SCAPE software, with the local option enabled and a distance cut-off score of 0.3. It included sequences of Bacillus subtilis group strains obtained from (i) the antiSMASH database (antiSMASH-DB) and (ii) the Minimum Information about a Biosynthetic Gene cluster (MIBiG) database.
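Two of the steps above can be illustrated with a short Python sketch. It is not the code used to produce this dataset. The first helper only builds a SPAdes argument list from the parameters stated above, and it assumes paired-end reads (the description does not state the library layout) and hypothetical file names. The second helper mimics what the contig concatenation step does, assuming plain end-to-end joining in file order with no spacer between contigs; the record name "merged_contigs" and the line width of 60 are illustrative choices.

```python
from typing import Iterator, List, Tuple


def spades_command(reads_1: str, reads_2: str, out_dir: str) -> List[str]:
    """Build a SPAdes argument list with the parameters named in the text.

    Assumption: paired-end input (-1/-2). File names are hypothetical.
    """
    return [
        "spades.py", "--careful",
        "-k", "21,33,55,77,99,127",
        "-1", reads_1, "-2", reads_2,
        "-o", out_dir,
    ]


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def concatenate_contigs(fasta_text: str,
                        record_id: str = "merged_contigs",
                        width: int = 60) -> str:
    """Join all contigs, in file order, into a single FASTA record."""
    merged = "".join(seq for _, seq in parse_fasta(fasta_text))
    lines = [merged[i:i + width] for i in range(0, len(merged), width)]
    return "\n".join([f">{record_id}"] + lines) + "\n"


if __name__ == "__main__":
    example = ">contig_1\nACGT\nAC\n>contig_2\nGGTT\n"
    print(concatenate_contigs(example), end="")
    print(" ".join(spades_command("reads_R1.fastq", "reads_R2.fastq", "assembly_out")))
```

The contig order in the merged record follows the order in the input file, so in this workflow the concatenation would be applied after the contigs have been reordered against the reference genome.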