monoPGT Superfamily Sequence Similarity Network and Genome Neighborhood Network

Published: 22 January 2021 | Version 1 | DOI: 10.17632/zcx42s9mzf.1


Cytoscape-compatible files of the sequence similarity network (SSN) and genome neighborhood network (GNN) of the monoPGT superfamily. Dataset S1 is a 40% representative node network file built from an all-by-all BLAST comparison (E-value: 1 × 10^-90). Dataset S2 is a genome neighborhood network file generated from the same 40% representative node network (E-value: 1 × 10^-90). Dataset S3 is a genome neighborhood diagram of selected polyPGT_F-monoPGT_NF members, visualized in SI Appendix Figure S4 of the accompanying manuscript. Dataset S4 is a spreadsheet containing UniProt IDs and FASTA header information for all representative nodes in the sequence similarity network.
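For readers who want to work with the FASTA header information in Dataset S4 programmatically, the following is a minimal sketch, not part of the deposited dataset. It assumes the headers follow the standard UniProtKB FASTA layout (`>db|Accession|EntryName ProteinName OS=... OX=... GN=... PE=... SV=...`); the actual column layout of the spreadsheet is not described here, so the function name `parse_uniprot_header` and the sample header are hypothetical and for illustration only.

```python
import re

# Regular expression for the standard UniProtKB FASTA header layout.
# Assumption: Dataset S4 headers use this layout; adjust if they do not.
HEADER_RE = re.compile(
    r"^>?(?P<db>sp|tr)\|(?P<accession>[^|]+)\|(?P<entry_name>\S+)\s+"
    r"(?P<protein_name>.*?)\s+OS=(?P<organism>.*?)\s+OX=(?P<taxon_id>\d+)"
    r"(?:\s+GN=(?P<gene>\S+))?"
    r"(?:\s+PE=(?P<pe>\d+))?(?:\s+SV=(?P<sv>\d+))?\s*$"
)


def parse_uniprot_header(header):
    """Split one UniProtKB-style FASTA header into named fields.

    Returns a dict of fields, or None if the header does not match
    the assumed layout. Optional fields (gene, pe, sv) may be None.
    """
    match = HEADER_RE.match(header.strip())
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    # Hypothetical header, not taken from the dataset.
    sample = (">tr|X0X0X0|X0X0X0_EXAMP Hypothetical transferase "
              "OS=Example bacterium OX=12345 GN=exmA PE=4 SV=1")
    fields = parse_uniprot_header(sample)
    print(fields["accession"], fields["organism"])
```

Returning `None` for non-matching headers, rather than raising, makes it easy to flag rows in the spreadsheet that deviate from the assumed layout.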


Steps to reproduce

See the manuscript for details: "Glycoconjugate pathway connections revealed by sequence similarity network analysis of the monotopic phosphoglycosyl transferases," Katherine H. O’Toole, Barbara Imperiali, and Karen N. Allen, Proc. Natl. Acad. Sci.


Massachusetts Institute of Technology, Boston University


Glycoconjugates, Membrane Protein, Superfamily Evolution