Self-supervised Benchmarking for scRNAseq Clustering

2023/7/10

PaperPlayer biorxiv bioinformatics

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.07.07.548158v1?rss=1

Authors: Tyler, S. R., Guccione, E., Schadt, E. E.

Abstract: Interpretations of single cell RNAseq (scRNAseq) data are typically built upon clustering results and/or cell-cell topologies. However, the validation process is often exclusively left to bench biologists, which can take years and tens of thousands of dollars. Furthermore, a lack of objective ground-truth labels in complex biological datasets has resulted in difficulties when benchmarking single cell analysis methods. Here, we address these gaps with count splitting, creating a cluster validation algorithm, accounting for Poisson sampling noise, and benchmarking 120 pipelines using an independent test set for ground-truth assessment, thus enabling the first self-supervised benchmark. Anti-correlation-based feature selection paired with locally weighted Louvain modularity on the Euclidean distance of 50 principal components with cluster validation showed the best performance of all tested pipelines for scRNAseq clustering, yielding reproducible, biologically meaningful populations. These new approaches enabled the discovery of a novel metabolic gene signature associated with hepatocellular carcinoma survival time.
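
The abstract mentions count splitting but does not spell out the procedure. As a rough illustration only, the sketch below shows binomial thinning, a common way to split a count matrix: each observed count is divided at random between a training and a test matrix. If the original counts are Poisson distributed, the two halves are independent Poisson counts, which is what allows the test half to act as an independent check. The function name `count_split`, the `train_fraction` parameter, and the toy data are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def count_split(counts, train_fraction=0.5, seed=0):
    """Split a cells-by-genes count matrix into train and test matrices.

    Hypothetical helper: each count is thinned with a binomial draw, so
    train + test always equals the original count.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=np.int64)
    # Each of the `counts[i, j]` molecules goes to the training split
    # independently with probability `train_fraction`.
    train = rng.binomial(counts, train_fraction)
    # Whatever is not assigned to training is assigned to test.
    test = counts - train
    return train, test


# Toy data (simulated, not from the paper): 200 cells by 30 genes.
toy_rng = np.random.default_rng(1)
toy_counts = toy_rng.poisson(lam=5.0, size=(200, 30))

train, test = count_split(toy_counts)
print(train.shape, test.shape)
```

Clustering would then be run on `train` only, with `test` held back to check whether the resulting groups differ beyond sampling noise.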

Copyrights belong to the original authors. Visit the link for more info.

Podcast created by Paper Player, LLC