Dataset for Neural Network

Published: 4 October 2022 | Version 2 | DOI: 10.17632/26cy79sv84.2
Ke Chen


This is the dataset for the manuscript "Neural Network Establish Co-Occurrence Links between Transformation Products of Contaminants and Soil Microbiomes". Quantitative HRMS data are provided here. The abstract of the manuscript is as follows:

It remains challenging for ecologists and environmental scientists to identify which microorganisms carry out specific metabolic processes in the natural environment, even though stable isotope probing (e.g., DNA-SIP) can link degraders to their substrates. As a new strategy, we combined a network-based algorithm, MMvec, with our 2H-labeled Stable Isotope-Assisted Metabolomics pipeline (2H-SIAM) to discover links between transformation products (TPs) of a contaminant and microbes in soils. Abiotic stresses were first used to shape the assembly of soil microbiomes, which was characterized by 16S rRNA gene sequencing. Pyrene and pyrene-d10 were added to the soils for biodegradation, and 2H-SIAM was used to obtain the TPs of pyrene. MMvec was then used to establish a co-occurrence network between TPs and microbiomes. The results confirmed the role of Pseudomonas and Phenylobacterium in the oxidation, mineralization, and methylation of pyrene. Sphingomonas and the phylum Acidobacteria contributed to the oxidation of pyrene. The co-occurrence network obtained agreed closely with reports based on DNA-SIP, indicating the performance and reliability of the co-occurrence network. In summary, we depict for the first time the links between TPs and microbes in an environmental matrix, an approach that offers unique advantages compared with other isotope-based approaches.

For the installation of MMvec, please refer to [...]. In this study, MMvec was run on the qiime2-2020.6 platform using the code provided in a .doc document. Additionally, two files required for evaluating MMvec, "lcms_nt.txt" and "otus_nt.txt", are provided here.


Steps to reproduce

Soil Samples

Crude soil was collected from the campus as in our previous study [29]. The soil was air-dried and sieved (< 2 mm) to remove debris, mixed with Cd2+ (CdSO4) or Cu2+ (CuSO4) (10 ppm or 100 ppm in Milli-Q water), and kept in the lab at room temperature for a 90-day acclimation. Pyrene and pyrene-d10 (in acetone, ACE) were then added to one-quarter of the soil, which was mixed with the remaining soil to obtain soils containing 20 ppm pyrene or pyrene-d10. The soils were then incubated in Petri dishes in the lab and watered with Milli-Q water (EQ7000, Waters-Millipore Corporation, Milford, MA) twice per month to maintain moisture. There were 4 pyrene treatments (PyrCd10, PyrCd100, PyrCu10, and PyrCu100) with 4 independent replicates each (n = 16).

Each gram of contaminated soil was extracted with ACE and hexane (HEX) (1:1, v/v) by microwave extraction (Anton Paar GmbH, Multiwave PRO, Austria), with ortho-terphenyl (OTP) added as the extraction surrogate and sodium sulfate (Na2SO4) used to remove residual water. The solvent was subsequently replaced with acetonitrile (MeCN) by solvent exchange, and the extracts were concentrated to 0.5 mL under nitrogen flow.

HRMS was carried out by UPLC-ESI-HRMS on an Ultimate 3000 (Dionex) coupled to a Q Exactive Orbitrap mass spectrometer (Thermo Fisher Scientific, USA) with a heated electrospray ionization (ESI) source. The column was a Thermo HypersilGoldC18 (250 × 4.6 mm, 5 μm). The following chromatographic conditions were used: 3 μL of sample was injected into the UPLC-ESI-HRMS system. The UPLC solvents were A, water with 0.1% formic acid (FA), and B, MeCN with 0.1% FA. UPLC was performed at 1 mL/min and 25 °C with the following linear gradient (minutes, %B): 0, 5%; 4, 5%; 8, 95%; 26, 95%; 28, 5%; 30, 5%. Analysis was carried out in positive ionization mode with a resolving power of 70 000 FWHM (full width at half maximum) at m/z 200.
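The gradient program listed above can be read as a piecewise-linear function of time. The following is a minimal Python sketch, assuming straight-line interpolation between the listed breakpoints; the function name `percent_b` is hypothetical and is not part of the dataset or the instrument software.

```python
# Gradient breakpoints taken from the method text: (time in minutes, %B).
GRADIENT = [(0, 5), (4, 5), (8, 95), (26, 95), (28, 5), (30, 5)]


def percent_b(t_min, gradient=GRADIENT):
    """Return %B at time t_min, interpolating linearly between breakpoints.

    Assumption: the pump ramps linearly between consecutive breakpoints,
    as implied by the phrase "linear gradient".
    """
    if not gradient[0][0] <= t_min <= gradient[-1][0]:
        raise ValueError("time is outside the 0-30 min gradient program")
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


# Example: solvent composition halfway through the 4-8 min ramp.
mid_ramp = percent_b(6)
```

Such a helper may be useful for checking at which solvent composition a given retention time falls when comparing features across runs.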
For the quantitative study, extracts from pyrene-treated samples were analyzed by UPLC-ESI-HRMS. Raw HRMS data were converted to .mzXML format and imported into MZmine2 (2.53) to obtain a feature list, and signals were normalized to the total ion signal. Features annotated as TPs by 2H-SIAM were selected for the co-occurrence study of TPs and microbes.
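The normalization step can be illustrated with a short Python sketch. It assumes that normalizing by total ion signal means dividing each feature intensity by the summed intensity of all features in the same sample; the exact normalizer settings used in MZmine2 are not stated here and may differ. The function name, feature IDs, and intensity values are hypothetical and serve only as an illustration.

```python
def normalize_by_total_signal(feature_table):
    """Scale each sample so that its feature intensities sum to 1.

    feature_table maps sample name -> {feature ID: raw intensity}.
    Assumption: "total ion signal" is the sum of all feature intensities
    in that sample.
    """
    normalized = {}
    for sample, features in feature_table.items():
        total = sum(features.values())
        if total <= 0:
            raise ValueError(f"sample {sample!r} has no positive signal")
        normalized[sample] = {
            fid: intensity / total for fid, intensity in features.items()
        }
    return normalized


# Hypothetical intensities for two samples (not taken from the dataset).
table = {
    "PyrCd10_rep1": {"F1": 2.0e6, "F2": 6.0e6},
    "PyrCu100_rep1": {"F1": 1.0e6, "F2": 1.0e6},
}
norm = normalize_by_total_signal(table)
```

After this scaling, intensities are expressed as fractions of each sample's total signal, which makes samples with different overall signal levels comparable before the co-occurrence analysis.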


Mass Spectrometry, Contamination, Neural Network