Semi-supervised integration of single-cell transcriptomics data

2023/7/7
PaperPlayer biorxiv bioinformatics

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.07.07.548105v1?rss=1

Authors: Andreatta, M., Herault, L., Gueguen, P., Gfeller, D., Berenstein, A. J., Carmona, S. J.

Abstract: Single-cell sequencing technologies offer unprecedented opportunities to characterize the complexity of biological samples with high resolution. At the same time, variations in sample processing and experimental protocols introduce technical variability - or "batch effects" - in the molecular readouts, hindering comparative analyses across samples and individuals. Although batch effect correction methods are routinely applied in single-cell omics analyses, data integration often leads to overcorrection, resulting in the loss of true biological variability. In this study, we present STACAS v2, a semi-supervised scRNA-seq data integration method that leverages prior knowledge in the form of cell type annotations to preserve biological variance. Through an open and reproducible benchmarking pipeline, we show that semi-supervised STACAS outperforms popular unsupervised methods such as Harmony, FastMNN, Seurat v4, scVI, and Scanorama, as well as supervised methods such as scANVI and scGen. Notably, STACAS is robust to incomplete and imprecise cell type annotations, which are commonly encountered in real-life integration tasks. Highlighting its scalability, we successfully applied semi-supervised STACAS to construct a high-resolution map of tumor-infiltrating CD8 T cells encompassing over 500,000 cells from 265 individuals. Based on our findings, we argue that the incorporation of prior cell type information should be a common practice in single-cell data integration, and we provide a flexible framework for semi-supervised batch effect correction. STACAS seamlessly integrates with Seurat pipelines and can be run with one command: Run.STACAS(seurat.list, cell.labels).
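To make the idea of label-aware integration concrete, here is a toy sketch of one way prior cell type annotations could guide batch correction. The abstract does not describe how STACAS uses the labels internally, so everything below is an assumption made for illustration: the function name `filter_anchors`, the notion of candidate cross-batch "anchor" pairs, and the hard rule that a pair is kept only when the two labels agree or at least one is missing. It is not the STACAS implementation.

```python
from typing import List, Optional, Sequence, Tuple

Anchor = Tuple[int, int]  # (index of cell in batch A, index of cell in batch B)


def filter_anchors(
    anchors: Sequence[Anchor],
    labels_a: Sequence[Optional[str]],
    labels_b: Sequence[Optional[str]],
) -> List[Anchor]:
    """Keep candidate cross-batch pairs whose cell type labels are consistent.

    Hypothetical rule (an assumption, not taken from the paper): a pair is
    dropped only when both cells are labeled and the labels differ. Cells
    without a label (None) are treated as compatible with any cell, which is
    one simple way to tolerate incomplete annotations.
    """
    kept = []
    for i, j in anchors:
        label_a, label_b = labels_a[i], labels_b[j]
        if label_a is None or label_b is None or label_a == label_b:
            kept.append((i, j))
    return kept


# Made-up example data: two small batches with partial annotations.
labels_a = ["CD8 T", "CD4 T", None, "B cell"]
labels_b = ["CD8 T", "B cell", "CD4 T"]
candidate_anchors = [(0, 0), (1, 1), (2, 2), (3, 1), (1, 2)]

kept = filter_anchors(candidate_anchors, labels_a, labels_b)
print(kept)  # [(0, 0), (2, 2), (3, 1), (1, 2)]
```

Only the pair linking a "CD4 T" cell to a "B cell" is discarded here; the unlabeled cell keeps its pair. In this sketch, that is how missing annotations are tolerated: they never cause a pair to be rejected.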

Copyright belongs to the original authors. Visit the link for more information.

Podcast created by Paper Player, LLC