Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.03.10.532096v1?rss=1
Authors: Yamaguchi, K., Abdelbaky, S., Yu, L., Oakes, C. C., Coombes, K. R.
Abstract: Motivation: The rapid growth in the number and application of high-throughput "omics" technologies has created a need for better methods to integrate multiomics data sets. Much progress has been made in developing unsupervised methods, but supervised methods have lagged behind. Results: We develop a novel algorithm, plasma, to train and validate models to predict time-to-event outcomes from multiomics data sets. The model is built on using two layers of the existing partial least squares algorithm to first select components that covary with the outcome in order to construct a joint Cox proportional hazards model. We apply plasma to the lung squamous cell carcinoma (LUSC) data from The Cancer Genome Atlas (TCGA). Our model successfully separates an independent test data set into high risk and low risk patients (p = 0.0132). The performance of the joint multiomics model is superior to that of the individual omics data sets. It is also superior to the performance of an approach that uses an unsupervised method (Multi Omics Factor Analysis; MOFA) to find factors that might work as predictors. Many of the factors that contribute strongly to the plasma model can be justified from the biological literature.
Copyright belongs to the original authors. Visit the link for more information.
Podcast created by Paper Player, LLC