The Python script and files to build TMN

Published: 24 November 2022 | Version 1 | DOI: 10.17632/683zs2ssyy.1


The Python script and files used to build the targeted molecular networks (TMNs) in the manuscript "A compounds annotation strategy combining hand-in-hand alignment with targeted molecular networking for offline two-dimensional liquid chromatography-mass spectrometry analysis: chemical profile of Yupingfeng, a classical traditional Chinese medicine prescription, as a case study".


Steps to reproduce

Step 1: Ascertainment of the FPI and FNL of the targeted compounds. The FPI and FNL of the targeted compounds were mined as targets for the components of interest. The aim was to characterize the mass spectrometry features of those components as accurately as possible, so that they could be quickly located and screened in the next step.

Step 2: Development of a focused ion list. MS-DIAL software was used to rapidly focus on the relevant ions of the target compounds against a complex chemical background, based on the functions of PIF and NFL.

Step 3: Creation of the MS2-similarity adjacency matrix. TMNs were built from the MS2-similarity adjacency matrices of all ions in the focused ion list. A Python script was written to compute the pairwise MS2 similarity of all focused ions and to output an MS2-similarity adjacency matrix. Based on an open MS2 similarity algorithm of “matchms”, the script searched the “.mgf” data for the MS2 information corresponding to the ID of each focused ion, extracted it, and then calculated and generated the MS2-similarity adjacency matrix. The Python script, its requirements and demo data were compressed into a “.rar” file, which is available in Appendix B.

Step 4: Construction of the network node and edge tables. The MS2-similarity adjacency matrix of the focused ions was imported into Origin Pro software for initial adjacency-matrix network analysis. The minimum similarity required to generate an edge between two ions was set to 0.75. The original node table and edge table for the TMN were then derived from Origin Pro. The ΔRt and Δm/z between all nodes were calculated using a composite Excel formula built from the “INDEX”, “MATCH” and “IF” functions.
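
The edge-table part of Step 4 (similarity threshold plus ΔRt and Δm/z) can be sketched in Python as follows. This is only an illustrative sketch, not the deposited script or the Origin Pro/Excel workflow: the ion IDs, Rt and m/z values, and the similarity matrix are hypothetical demo data, the function name `build_edge_table` is invented for this example, and treating the 0.75 threshold as inclusive and the differences as absolute values are assumptions.

```python
from itertools import combinations

# Hypothetical demo data: ion ID -> (Rt in min, m/z). Values are illustrative only.
nodes = {
    "ion_1": (5.10, 447.09),
    "ion_2": (5.11, 285.04),
    "ion_3": (8.40, 431.10),
}
ids = list(nodes)

# Hypothetical symmetric MS2-similarity adjacency matrix, ordered like `ids`.
similarity = [
    [1.00, 0.90, 0.80],
    [0.90, 1.00, 0.40],
    [0.80, 0.40, 1.00],
]

# Minimum similarity stated in the text; the inclusive comparison is an assumption.
MIN_SIMILARITY = 0.75


def build_edge_table(ids, nodes, similarity, min_similarity=MIN_SIMILARITY):
    """Return one edge per ion pair whose MS2 similarity reaches the threshold."""
    edges = []
    # Visit each unordered pair once (upper triangle of the matrix).
    for i, j in combinations(range(len(ids)), 2):
        score = similarity[i][j]
        if score < min_similarity:
            continue  # too dissimilar: no edge
        rt_i, mz_i = nodes[ids[i]]
        rt_j, mz_j = nodes[ids[j]]
        edges.append({
            "source": ids[i],
            "target": ids[j],
            # Absolute differences are an assumption of this sketch.
            "delta_rt": round(abs(rt_i - rt_j), 4),
            "delta_mz": round(abs(mz_i - mz_j), 4),
            "ms2_similarity": score,
        })
    return edges


edge_table = build_edge_table(ids, nodes, similarity)
for edge in edge_table:
    print(edge)
```

With this demo data, the pair ion_2/ion_3 (similarity 0.40) is dropped and the other two pairs become edges, each carrying its own ΔRt and Δm/z.
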
In theory, in-source fragment ions and adduct precursor ions originating from the same compound within ΔRt ≤ 0.02 min show the expected high MS2 similarity, and can therefore cause data redundancy and false-positive identifications in the molecular network. Therefore, in the edge table, nodes with ΔRt ≤ 0.02 min were considered potentially redundant data (PRD), whereas nodes with ΔRt > 0.02 min in the edge table, or nodes without an edge, were considered non-redundant data (NRD). The redundant data (RD) values in the node and edge tables were assigned as 1 for PRD and 0 for NRD. This produced the node table and edge table of the TMNs. The node table included the ID, Rt, m/z, Rt_m/z and RD; the edge table included the Source ID, Target ID, ΔRt, Δm/z and MS2 similarity.

Step 5: Visualization and annotation of the TMNs. Finally, the node table and edge table were imported into Cytoscape software for visualization and annotation of the TMNs. The focused ion lists, MS2-similarity adjacency matrices, and node and edge tables were compressed into another “.rar” file, which is openly accessible in Appendix C. The established TMNs are shown individually in Fig. A.9-A.15.
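
The redundancy flagging described above (RD = 1 for PRD, RD = 0 for NRD) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the ion IDs and ΔRt values are hypothetical, the function name `flag_redundancy` is invented for this example, and marking a node as PRD when it takes part in at least one edge with ΔRt ≤ 0.02 min is one reading of the rule in the text.

```python
# ΔRt cut-off (min) taken from the text.
RT_TOLERANCE_MIN = 0.02

# Hypothetical demo data; ion_4 has no edge at all.
node_ids = ["ion_1", "ion_2", "ion_3", "ion_4"]
edges = [
    {"source": "ion_1", "target": "ion_2", "delta_rt": 0.01},
    {"source": "ion_1", "target": "ion_3", "delta_rt": 3.30},
]


def flag_redundancy(node_ids, edges, rt_tolerance=RT_TOLERANCE_MIN):
    """Assign RD = 1 (PRD) or RD = 0 (NRD) to every node and edge."""
    # Every node starts as NRD, which also covers nodes without any edge.
    node_rd = {node_id: 0 for node_id in node_ids}
    flagged_edges = []
    for edge in edges:
        rd = 1 if edge["delta_rt"] <= rt_tolerance else 0
        flagged_edges.append({**edge, "rd": rd})
        if rd:
            # Assumption: both endpoints of a near-coeluting edge become PRD.
            node_rd[edge["source"]] = 1
            node_rd[edge["target"]] = 1
    return node_rd, flagged_edges


node_rd, flagged_edges = flag_redundancy(node_ids, edges)
print(node_rd)
print(flagged_edges)
```

Here ion_1 and ion_2 are flagged as PRD because they are linked by an edge with ΔRt = 0.01 min, while ion_3 (ΔRt = 3.30 min to ion_1) and the unconnected ion_4 remain NRD.
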


Nanjing University of Chinese Medicine, Jinan University, Shanghai Institute of Materia Medica Chinese Academy of Sciences


Methods Development


National Natural Science Foundation of China