scSPARKL: Apache Spark based parallel analytical framework for the downstream analysis of scRNA-seq data.

2023/4/8
PaperPlayer biorxiv bioinformatics

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.04.07.536003v1?rss=1

Authors: Adil, A., Bhattacharya, N., Asger, M.

Abstract: As the field of single-cell genomics continues to develop, the generation of large-scale scRNA-seq datasets has become more prevalent. While these datasets offer tremendous potential for shedding light on the complex biology of individual cells, the sheer volume of data presents significant challenges for management and analysis. To address these challenges, a new discipline, known as "big single-cell data science," has emerged. Within this field, a variety of computational tools have been developed to facilitate the processing and interpretation of scRNA-seq data. In this paper, we present a novel parallel analytical framework, scSPARKL, that leverages the power of Apache Spark to enable the efficient analysis of single-cell transcriptomic data. Our methodology incorporates six key operations for dealing with single-cell Big Data, including data reshaping, data preprocessing, cell/gene filtering, data normalization, dimensionality reduction, and clustering. By utilizing Spark's unlimited scalability, fault tolerance, and parallelism, scSPARKL enables researchers to rapidly and accurately analyze scRNA-seq datasets of any size. We demonstrate the utility of our framework through a series of experiments on simulated and real-world scRNA-seq data. Overall, our results suggest that scSPARKL represents a powerful and flexible tool for the analysis of single-cell transcriptomic data, with broad applications across the fields of biology and medicine.
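To make three of the operations named in the abstract concrete (data reshaping, cell filtering, and normalization), here is a minimal sketch in plain Python. It is not the scSPARKL API and does not use Spark. The function names, the toy counts, the filtering threshold, and the scale factor are all assumptions made up for illustration. In a Spark-based framework, the same steps would presumably be expressed as distributed DataFrame operations rather than in-memory loops.

```python
import math
from collections import defaultdict

# Toy wide-format count matrix: one row per gene, one column per cell.
# All values are invented for illustration only.
cells = ["cell_a", "cell_b", "cell_c"]
wide_counts = {
    "gene_1": [10, 0, 0],
    "gene_2": [0, 5, 0],
    "gene_3": [30, 15, 1],
}


def cell_totals(records):
    """Sum the counts of every cell across all of its genes."""
    totals = defaultdict(int)
    for cell, _gene, count in records:
        totals[cell] += count
    return totals


def reshape_to_long(wide, cell_names):
    """Data reshaping: wide matrix -> (cell, gene, count) records, dropping zeros."""
    return [
        (cell, gene, count)
        for gene, row in wide.items()
        for cell, count in zip(cell_names, row)
        if count > 0
    ]


def filter_cells(records, min_total_counts):
    """Cell filtering: keep cells whose total count reaches the threshold."""
    totals = cell_totals(records)
    return [r for r in records if totals[r[0]] >= min_total_counts]


def normalize(records, scale=10_000):
    """Normalization: rescale each cell to `scale` total counts, then apply log1p."""
    totals = cell_totals(records)
    return [
        (cell, gene, math.log1p(count / totals[cell] * scale))
        for cell, gene, count in records
    ]


long_records = reshape_to_long(wide_counts, cells)
kept = filter_cells(long_records, min_total_counts=5)  # hypothetical threshold
normalized = normalize(kept)
```

The long (cell, gene, count) layout is used here because record-per-row data partitions naturally across workers. Whether scSPARKL uses this exact layout is not stated in the abstract.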

Copyrights belong to the original authors. Visit the link for more info.

Podcast created by Paper Player, LLC