cover of episode RabbitVar: ultra-fast and accurate somatic small-variant calling on multi-core architectures

RabbitVar: ultra-fast and accurate somatic small-variant calling on multi-core architectures

2023/1/6
logo of podcast PaperPlayer biorxiv bioinformatics

PaperPlayer biorxiv bioinformatics

Shownotes Transcript

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.01.06.522980v1?rss=1

Authors: Zhang, H., Song, H., Yin, Z., Chang, Q., Wei, Y., Niu, B., Schmidt, B., Liu, W.

Abstract: The continuous development of next-generation sequencing (NGS) technology has led to extensive and frequent use of genomic analysis in cancer research. The associated production of large-scale NGS datasets establishes the need for high-precision somatic variant calling methods that are highly optimized on commonly used hardware platforms. We present RabbitVar (https://github.com/LeiHaoa/RabbitVar), a scalable variant caller that can detect small somatic variants from paired tumor/normal NGS data on modern multi-core CPUs. Our approach combines candidate-finding and machine-learning-based filtering strategies with optimized data structures and multi-threading to achieve both high accuracy and efficiency. We have compared the performance of RabbitVar to leading state-of-the-art callers (Strelka2, Mutect2, NeuSomatic, VarDict, VarScan2) on real-world HCC1395 breast cancer datasets under different sequencing conditions and contamination rates. The evaluation results demonstrate that RabbitVar achieves highly competitive F1-scores when calling SNVs. Moreover, when calling the more challenging indel variants, it consistently achieves the highest F1-scores. RabbitVar is able to process a paired tumor and normal whole human genome sequencing datasets with 80x depth in less than 20 minutes on a 48-core workstation outperforming all other tested variant callers in terms of efficiency.

Copy rights belong to original authors. Visit the link for more info

Podcast created by Paper Player, LLC