Data: Benchmarking batch correction methods for synthesizing imbalanced microbiome community profiles

Name: Data: Benchmarking batch correction methods for synthesizing imbalanced microbiome community profiles
Creator: Alicia Foxx
Published: 2023-05-24T07:06:13.404Z
Keywords: Microbiome, Seed

Foxx, Alicia; Rivers, Adam

doi:10.17632/5xrfg5dym6.1

Published: 24 May 2023 | Version 1 | DOI: 10.17632/5xrfg5dym6.1

Contributors: Alicia Foxx, Adam Rivers

Description

Batch variation is unwanted variation that plagues syntheses of microbiome sequence data. Batch effect correction algorithms (BECAs) aim to remove batch effects, but most BECAs do not account for a common problem in which batch covariates of interest are imbalanced (e.g., when classes do not appear in all batches or appear in uneven sample proportions). Here we tested five BECAs on eight seed microbiome studies, which are prone to severe batch effects due to variable seed handling practices. The BECAs we compared were zero-mean centering (ZMC), Ratio-A, ConQuR, PLSDA, and wPLSDA (developed for imbalanced batch covariates). We also accounted for the sparsity and compositionality of microbiome data with zero imputation and the centered log-ratio (CLR) transformation. We found that 1) according to a redundancy analysis, no method reduced the variation explained by the unwanted covariate to zero; 2) ConQuR, Ratio-A, and ZMC removed batch effects according to a guided principal component analysis, which quantifies the magnitude of batch effects (δ = 0, p < 0.001); and 3) when combined with ZMC, CLR and zero imputation improved both the removal of batch effects and the variance explained by the wanted variable. These results call for careful application of BECAs and indicate that ZMC, Ratio-A, and ConQuR provide some improvement in remediating batch effects in batch-covariate imbalanced data. Continued development of BECAs is urgently required for successful batch correction in this use case.
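
For readers unfamiliar with the preprocessing named above, the following is a minimal sketch of a CLR transformation followed by zero-mean centering within each batch. It is illustrative only and is not the authors' pipeline: the pseudocount used to handle zeros is an assumed stand-in for the zero imputation step, and the function names `clr` and `zero_mean_center`, the toy counts, and the batch labels are all hypothetical.

```python
import math
from collections import defaultdict


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    Zeros are handled by adding a pseudocount to every count. This is an
    assumption made for illustration, not the imputation used in the study.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]


def zero_mean_center(samples, batches):
    """Subtract each batch's per-feature mean from the samples in that batch."""
    by_batch = defaultdict(list)
    for i, b in enumerate(batches):
        by_batch[b].append(i)

    centered = [None] * len(samples)
    for idx in by_batch.values():
        n_features = len(samples[idx[0]])
        # Per-feature mean computed over the samples of this batch only.
        means = [sum(samples[i][j] for i in idx) / len(idx)
                 for j in range(n_features)]
        for i in idx:
            centered[i] = [samples[i][j] - means[j] for j in range(n_features)]
    return centered


# Toy data: 4 samples x 3 taxa in two batches (hypothetical values).
counts = [[10, 0, 5], [20, 3, 1], [0, 8, 8], [4, 4, 40]]
batches = ["A", "A", "B", "B"]

clr_profiles = [clr(row) for row in counts]
corrected = zero_mean_center(clr_profiles, batches)
```

After this step every feature has mean zero within each batch. Centering of this kind uses only the batch labels, so it does not by itself address the batch-covariate imbalance that the description highlights.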
