NOMAD2 provides ultra-efficient, scalable, and unsupervised discovery on raw sequencing reads

2023/3/21
PaperPlayer biorxiv bioinformatics

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.03.17.533189v1?rss=1

Authors: Kokot, M., Dehghannasiri, R., Baharav, T. Z., Salzman, J., Deorowicz, S.

Abstract: NOMAD is a new, unsupervised, reference-free, and unifying algorithm that discovers regulated sequence variation through statistical analysis of k-mer composition in DNA or RNA sequencing experiments. It subsumes many application-specific algorithms, from splicing detection to RNA editing to applications in DNA sequencing and beyond. Here, we introduce NOMAD2, a fast, scalable, and user-friendly implementation of NOMAD based on KMC, an efficient k-mer counting approach. The pipeline has minimal installation requirements and can be executed with a single command. NOMAD2 enables efficient analysis of massive RNA-Seq datasets, where it reveals novel biology, showcased by rapid analysis of 1,553 human muscle cells, the entire Cancer Cell Line Encyclopedia (671 cell lines, 5.7 TB), and a deep RNA-Seq study of Amyotrophic Lateral Sclerosis (ALS) with ~2-fold less computational resource and time than state-of-the-art alignment methods. NOMAD2 enables reference-free biological discovery at unmatched scale and speed. Because it bypasses genome alignment, NOMAD2 yields new insights into RNA expression in normal and disease tissue; we provide examples of these insights to introduce NOMAD2 and to enable expansive biological discovery not previously possible.
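
For listeners unfamiliar with the term, the k-mer counting that the abstract mentions can be illustrated with a minimal Python sketch. This is not KMC's or NOMAD2's implementation, and it does not reproduce NOMAD's statistical test; it only shows what counting k-mers directly from raw reads, with no reference genome, means. The function name `count_kmers`, the toy reads, and the choice to skip k-mers containing `N` are assumptions made for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring (k-mer) across an iterable of reads.

    Hypothetical helper for illustration only; real tools such as KMC use
    far more memory- and disk-efficient strategies.
    """
    counts = Counter()
    for read in reads:
        seq = read.upper()
        # Slide a window of width k over the read, one base at a time.
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Assumption: skip windows containing ambiguous bases.
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


# Toy reads standing in for two sequencing samples (made-up data).
sample_a = ["ACGTAC", "CGTACG"]
sample_b = ["ACGTTC"]

counts_a = count_kmers(sample_a, 3)
counts_b = count_kmers(sample_b, 3)

# k-mers seen in one sample but not the other hint at sample-specific sequence.
only_in_b = sorted(set(counts_b) - set(counts_a))
print(only_in_b)
```

Comparing per-sample k-mer counts like this is the kind of raw material a reference-free method can work from; how NOMAD turns such counts into statistical calls is described in the paper itself.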

Copyright belongs to the original authors. Visit the link for more info.

Podcast created by Paper Player, LLC