Profile Hidden Markov Model trained on UDP-diNAcBac PGT cluster, multiple sequence alignment and tree of 4,693 bacterial monotopic PGTs

Published: 20 July 2023 | Version 1 | DOI: 10.17632/b57wtpx78y.1


File: smPGT_diNAcBac_hmm_102522.hmm
This is a Profile Hidden Markov Model that can be applied to monotopic bacterial phosphoglycosyl transferases (PGTs) to assign substrate specificity. The model was built with HMMER 3.3 from a randomly selected half of the PGTs found in the diNAcBac cluster of the attached tree. PGT sequences that score above 200 against this model likely use UDP-diNAcBac as their preferred substrate.

File: sm_PGT_alignment_012323.fasta
This is a multiple sequence alignment of 4,693 bacterial monotopic PGTs.

File: smPGT_tree_012323.newick
This is a tree generated in Geneious Prime from the above multiple sequence alignment.
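The score cutoff given for the HMM file above can be applied to the tabular output of an HMMER search. The following is a minimal sketch, not the authors' pipeline. It assumes the search was run with something like `hmmsearch --tblout hits.tbl smPGT_diNAcBac_hmm_102522.hmm sequences.fasta`, where `hits.tbl` and `sequences.fasta` are hypothetical file names, and it assumes the "score" in the description refers to the full-sequence bit score (the sixth whitespace-separated column of the `--tblout` table). The function names are illustrative only.

```python
from typing import Iterable, List, Tuple

# Cutoff quoted in the dataset description; treated here as a full-sequence
# bit score, which is an assumption.
SCORE_THRESHOLD = 200.0


def parse_tblout(lines: Iterable[str]) -> List[Tuple[str, float]]:
    """Return (target name, full-sequence bit score) pairs from --tblout lines."""
    hits = []
    for line in lines:
        # Skip blank lines and the '#' comment/header lines HMMER writes.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 6:
            continue  # not a complete hit row
        # Column 1 is the target name, column 6 the full-sequence score.
        hits.append((fields[0], float(fields[5])))
    return hits


def likely_dinacbac(hits: List[Tuple[str, float]],
                    threshold: float = SCORE_THRESHOLD) -> List[str]:
    """Names of sequences scoring strictly above the threshold."""
    return [name for name, score in hits if score > threshold]
```

To use it, open the table and pass the file handle directly, for example `likely_dinacbac(parse_tblout(open("hits.tbl")))`. The comparison is strict, matching the "above 200" wording; sequences scoring at or below the cutoff are simply left unassigned rather than classified as using a different substrate.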



Massachusetts Institute of Technology, Boston University


Hidden Markov Models, Sequence Analysis


National Institutes of Health

GM131627, GM039334, GM134576