Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.01.30.526187v1?rss=1
Authors: Hui, H. W. H., Goh, W. W. B.
Abstract: Statistical analyses of high-dimensional omics data are often hampered by the presence of batch effects (BEs) and missing values (MVs), but the interaction between these two issues is neither well studied nor well understood. MVs may manifest as a BE when their proportions differ across batches. These are termed Batch-Effect Associated Missing values (BEAMs). We hypothesized that BEAMs in data may introduce bias which can impede the performance of missing value imputation (MVI). To test this, we simulated data with two batches and then, over 100 iterations, introduced either 20% and 40% MVs into the two batches, respectively (BEAMs), or 30% MVs into both (control). K-nearest neighbours (KNN) was then used to perform MVI, in a typical global approach (M1) and a supposedly superior batch-sensitized approach (M2). BEs were then corrected using ComBat. The effectiveness of the MVI was evaluated by its imputation accuracy and its true and false positive rates. Notably, when BEAMs existed, M2 was generally undesirable, as the differing application of MV filtering in the M1 and M2 strategies resulted in an overall coverage deficiency. Additionally, both M1 and M2 strategies suffered in the presence of BEAMs, highlighting the need for a novel approach to handle MVI in data with BEAMs.
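Illustrative sketch (not the authors' code): the simulation design in the abstract can be approximated in a few lines of Python with NumPy. Everything below is an assumption made for illustration: the function names, the data dimensions, the additive batch shift, the completely-at-random masking, and the small hand-written KNN imputer that borrows values from neighbouring features. MV filtering, ComBat correction, the 100 iterations, and the true/false positive rate evaluation are omitted; only imputation error (RMSE) on the masked cells is reported for the global (M1-like) and per-batch (M2-like) strategies.

```python
import numpy as np

def simulate(n_features=200, n_per_batch=10, batch_shift=2.0, seed=0):
    """Features x samples matrix with two batches (hypothetical sizes)."""
    rng = np.random.default_rng(seed)
    data = rng.normal(size=(n_features, 2 * n_per_batch))
    batch = np.repeat([0, 1], n_per_batch)
    data[:, batch == 1] += batch_shift  # assumed additive batch effect
    return data, batch

def add_missing(data, batch, rates, rng):
    """Mask a fixed fraction of cells in each batch, completely at random."""
    out = data.copy()
    for b, rate in enumerate(rates):
        cols = np.flatnonzero(batch == b)
        n_cells = data.shape[0] * cols.size
        chosen = rng.choice(n_cells, size=int(round(rate * n_cells)), replace=False)
        rows, local = np.unravel_index(chosen, (data.shape[0], cols.size))
        out[rows, cols[local]] = np.nan
    return out

def knn_impute(x, k=5):
    """Fill NaNs in each row (feature) with the mean of its k nearest rows."""
    filled = x.copy()
    observed = ~np.isnan(x)
    for i in np.flatnonzero(~observed.all(axis=1)):
        shared = observed & observed[i]        # columns observed in both rows
        n_shared = shared.sum(axis=1)
        diff = np.where(shared, x - x[i], 0.0)
        dist = np.full(x.shape[0], np.inf)
        ok = n_shared > 0
        dist[ok] = (diff[ok] ** 2).sum(axis=1) / n_shared[ok]
        dist[i] = np.inf                       # a row is not its own neighbour
        order = np.argsort(dist)
        for j in np.flatnonzero(~observed[i]):
            donors = [r for r in order
                      if observed[r, j] and np.isfinite(dist[r])][:k]
            if donors:                         # otherwise the cell stays NaN
                filled[i, j] = x[donors, j].mean()
    return filled

def impute_per_batch(x, batch, k=5):
    """Batch-sensitized variant: impute each batch using only its own samples."""
    out = x.copy()
    for b in np.unique(batch):
        out[:, batch == b] = knn_impute(x[:, batch == b], k)
    return out

def rmse(truth, imputed, mask):
    """Error on originally masked cells that were actually imputed."""
    valid = mask & ~np.isnan(imputed)
    return float(np.sqrt(np.mean((truth[valid] - imputed[valid]) ** 2)))

truth, batch = simulate()
rng = np.random.default_rng(1)
for label, rates in [("BEAMs", (0.2, 0.4)), ("control", (0.3, 0.3))]:
    masked = add_missing(truth, batch, rates, rng)
    mask = np.isnan(masked)
    m1 = knn_impute(masked)                # global approach
    m2 = impute_per_batch(masked, batch)   # per-batch approach
    print(label, "global:", rmse(truth, m1, mask),
          "per-batch:", rmse(truth, m2, mask))
```

The numbers printed by this toy setup depend on the assumed dimensions, shift, and seeds, and should not be read as a reproduction of the paper's results.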
Copyright belongs to the original authors. Visit the link for more info.
Podcast created by Paper Player, LLC