IPF multi-omics datasets

Published: 20 October 2022 | Version 1 | DOI: 10.17632/4zr8jjmzhy.1


Metabolomics and lipidomics datasets of IPF and NDC apex and base lung explants.


Steps to reproduce

Raw LCMS data were processed using the metabolome-lipidome-MSDIAL pipeline (https://github.com/respiratory-immunology-lab/metabolome-lipidome-MSDIAL). Briefly, MS-DIAL v4.7 was used for peak detection and alignment against the MassBank database v2021.02 for metabolites and the MS-DIAL internal lipid database for lipids. Parameters were as follows: a minimum peak amplitude of 100000, a retention time tolerance of 1 min, and a mass tolerance of 0.002 Da, with gap-filling enabled. All other parameters were left at their default values. MS-DIAL intensity tables were imported into R and preprocessed using the pmp R package v1.6.0. Peaks were quality filtered using the following criteria: intensities at least 5-fold higher than ⅔ of the blank samples, features present in at least ⅔ of the samples, and a maximum relative standard deviation (RSD) of 20% in the QC samples. Peak intensities were normalized using probabilistic quotient normalization (PQN), followed by random forest imputation of missing values and a generalized logarithmic (glog) transformation to stabilize the variance across low- and high-intensity mass spectral features. Filtered metabolomics peaks were then mapped to the Human Metabolome Database (HMDB) v4.202107 with a mass tolerance of 0.002 Da. Annotated features were retained and curated using a combination of the MS-DIAL fill percentage and signal-to-noise ratio values, along with visual confirmation.
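The filtering, normalization and transformation steps described above can be illustrated with a minimal sketch. It is written in Python with NumPy for illustration only and is not the pmp implementation used to generate these datasets. The function names (`quality_filter`, `pqn_normalize`, `glog`), the features-by-samples matrix layout, the reading of the blank criterion (median sample intensity at least 5-fold above the blank intensity in at least ⅔ of the blanks), the use of the per-feature median as the PQN reference, and a user-supplied glog `lam` value are all assumptions. Random forest imputation is omitted.

```python
import numpy as np


def quality_filter(X, is_blank, is_qc, fold=5.0, min_fraction=2 / 3, max_rsd=0.20):
    """Return a boolean mask of features (rows of X) that pass all three filters.

    X is a features x samples intensity matrix, with NaN marking missing values.
    is_blank and is_qc are boolean vectors flagging blank and QC columns.
    """
    X = np.asarray(X, dtype=float)
    is_blank = np.asarray(is_blank, dtype=bool)
    is_qc = np.asarray(is_qc, dtype=bool)
    eps = 1e-9  # tolerance for floating-point comparison against min_fraction

    # Assumption: a feature that is missing in a blank counts as intensity 0 there.
    blanks = np.nan_to_num(X[:, is_blank], nan=0.0)
    qcs = X[:, is_qc]
    samples = X[:, ~(is_blank | is_qc)]

    # Blank filter (assumed reading): the median sample intensity must be at
    # least `fold` times the blank intensity in at least `min_fraction` of blanks.
    sample_median = np.nanmedian(samples, axis=1)
    above_blank = sample_median[:, None] >= fold * blanks
    blank_ok = above_blank.mean(axis=1) >= min_fraction - eps

    # Presence filter: detected in at least `min_fraction` of biological samples.
    present_ok = (~np.isnan(samples)).mean(axis=1) >= min_fraction - eps

    # RSD filter: relative standard deviation across QC injections.
    rsd = np.nanstd(qcs, axis=1, ddof=1) / np.nanmean(qcs, axis=1)
    rsd_ok = rsd <= max_rsd

    return blank_ok & present_ok & rsd_ok


def pqn_normalize(X, reference=None):
    """Probabilistic quotient normalization of a features x samples matrix.

    Assumption: the reference spectrum defaults to the per-feature median
    across all columns of X.
    """
    X = np.asarray(X, dtype=float)
    if reference is None:
        reference = np.nanmedian(X, axis=1)
    quotients = X / reference[:, None]
    # The median quotient of each sample estimates its dilution factor.
    factors = np.nanmedian(quotients, axis=0)
    return X / factors, factors


def glog(X, lam):
    """Generalized log transform, log(x + sqrt(x^2 + lam)), for a given lam."""
    X = np.asarray(X, dtype=float)
    return np.log(X + np.sqrt(X ** 2 + lam))
```

In this sketch the steps would be chained as `keep = quality_filter(X, is_blank, is_qc)`, then `pqn_normalize` on the retained biological and QC columns, then imputation, then `glog`. In practice the glog parameter is usually estimated from the QC samples rather than fixed by hand.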


Monash University