The effect of removing repeat-induced overlaps in de novo assembly

2023/4/18
PaperPlayer biorxiv bioinformatics

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.04.16.537101v1?rss=1

Authors: Shiarli Hossein Zade, R., Abeel, T.

Abstract: Determining accurate genotypes is important for associating phenotypes with genotypes. De novo genome assembly is a critical step in determining the complete genotype of species for which no reference exists yet. The main challenge in de novo assembly of eukaryotic genomes, particularly plant genomes, is the repetitive DNA sequences within them. The introduction of third-generation sequencing and its long reads promised to resolve repeat-related problems. While there have been notable improvements, reads originating from these repeats still cause errors because they introduce false overlaps in the assembly graph. This study analyzes the effect of repeats on de novo assembly and aims to improve the performance of existing de novo assembly algorithms by removing repeat-induced overlaps. First, we show the improvements in de novo assembly that are possible when repeat-induced overlaps are removed. Then we propose several methods for detecting and removing repeat-induced overlaps and evaluate their performance on several simulated datasets.
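
To make the idea of a repeat-induced overlap concrete, here is a minimal Python sketch. It is not the authors' method. It assumes simulated reads whose true origin in the genome is known, and it treats an overlap as repeat-induced when the two reads do not actually share any bases at their origin. The names Read, is_repeat_induced, filter_overlaps and the min_shared parameter are hypothetical and exist only for this illustration.

```python
from typing import NamedTuple


class Read(NamedTuple):
    """A simulated read with its known (assumed) true origin."""
    name: str
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def is_repeat_induced(a: Read, b: Read, min_shared: int = 1) -> bool:
    """Return True when two reads do not share at least `min_shared`
    bases of their true origin, so any overlap reported between them
    must come from similar sequence elsewhere (e.g. a repeat)."""
    if a.chrom != b.chrom:
        return True
    shared = min(a.end, b.end) - max(a.start, b.start)
    return shared < min_shared


def filter_overlaps(overlaps, reads, min_shared=1):
    """Keep only the overlaps whose reads truly overlap at their origin."""
    return [
        (x, y)
        for x, y in overlaps
        if not is_repeat_induced(reads[x], reads[y], min_shared)
    ]


# Hypothetical toy data: r1 and r2 truly overlap, r3 comes from elsewhere.
reads = {
    "r1": Read("r1", "chr1", 0, 1000),
    "r2": Read("r2", "chr1", 800, 1800),
    "r3": Read("r3", "chr1", 5000, 6000),
}
overlaps = [("r1", "r2"), ("r1", "r3")]
kept = filter_overlaps(overlaps, reads)
```

In this toy example only the r1/r2 overlap is kept. On real data the true origins are unknown, which is why the abstract speaks of proposing detection methods; this sketch only illustrates the ground-truth labeling that simulated datasets make possible.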

Copyright belongs to the original authors. Visit the link for more info.

Podcast created by Paper Player, LLC