Unprecedented genetic diversity suggests importance of understudied PFam54 paralogs to Lyme borreliosis spirochetes
These data accompany the preprint available at DOI: 10.22541/au.169323152.28257088/v1. The goal of the project was to determine how much standing variation exists in the PFam54 gene array in the three major Borrelia species that can cause Lyme borreliosis in humans across Eurasia (Borrelia afzelii, Borrelia bavariensis, and Borrelia garinii). The upload contains all raw inputs for the analyses performed (i.e., phylogenetic reconstruction, selection analysis) and all outputs of these analyses. The analysis is based on whole-genome sequencing data produced as part of a larger project, for samples described in a previous publication (DOI: 10.1111/mec.16805).
Steps to reproduce
All SRA files can be found on GenBank under the BioProject numbers PRJNA327303, PRJNA449844, and PRJNA722378. These are Illumina MiSeq datasets, which we assembled using SPAdes before applying the mapping protocol outlined in https://doi.org/10.1186/s12864-020-07054-3. In these de novo assembled genomes, we then searched the lp54 sequences with BLAST (algorithm: blastn) for the lp54-located PFam54 genes, and manually characterized the presence or absence of genes wherever gaps occurred in the gene array.
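The presence/absence step above was done manually. As an illustration only, the following minimal Python sketch shows how blastn tabular output (`-outfmt 6`, default 12 columns) could be pre-screened before manual inspection. The gene names, contig names, reference gene lengths, and the identity and coverage thresholds are all hypothetical placeholders and are not taken from the study.

```python
# Illustrative sketch, not the authors' pipeline: pre-screen blastn hits
# (tabular format, -outfmt 6) for presence/absence of query genes.
# All names and thresholds below are assumptions made for this example.

def parse_blast_tabular(text):
    """Parse default 12-column blastn tabular output into a list of dicts."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        hits.append({
            "query": fields[0],          # qseqid: query gene
            "subject": fields[1],        # sseqid: contig in the assembly
            "pident": float(fields[2]),  # percent identity
            "length": int(fields[3]),    # alignment length (includes gaps)
        })
    return hits


def presence_absence(hits, gene_lengths, min_identity=80.0, min_coverage=0.8):
    """Flag a gene as present if any single hit passes both thresholds.

    Coverage is approximated as alignment length / reference gene length.
    """
    present = {gene: False for gene in gene_lengths}
    for hit in hits:
        gene = hit["query"]
        if gene not in gene_lengths:
            continue
        coverage = hit["length"] / gene_lengths[gene]
        if hit["pident"] >= min_identity and coverage >= min_coverage:
            present[gene] = True
    return present


# Hypothetical example input: two hits, three reference genes.
EXAMPLE_BLAST = (
    "geneA\tcontig_12\t95.2\t980\t47\t0\t1\t980\t1500\t2479\t0.0\t1500\n"
    "geneB\tcontig_12\t91.0\t300\t27\t0\t1\t300\t4000\t4299\t1e-100\t400\n"
)
EXAMPLE_LENGTHS = {"geneA": 1000, "geneB": 1000, "geneC": 900}

calls = presence_absence(parse_blast_tabular(EXAMPLE_BLAST), EXAMPLE_LENGTHS)
print(calls)
```

Genes flagged as absent or only partially covered in such a screen would still need manual inspection, since a gene split across contigs or interrupted by an assembly gap would not pass a single-hit coverage threshold.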
Robert Koch Institut
LOEWE Center DRUID (Novel Drug Targets against Poverty-Related and Neglected Tropical Infectious Diseases)