Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.01.31.526521v1?rss=1
Authors: Kazantseva, E., Donmez, A., Pop, M., Kolmogorov, M.
Abstract: Bacterial species in microbial communities are often represented by mixtures of strains. Variation in strain genomes may have important phenotypic effects; however, strain-level deconvolution of microbial communities remains challenging. Short-read approaches can be used to detect small-scale variation between strains, but fail to phase these variants into contiguous haplotypes. Recent advances in long-read metagenomics have resulted in complete de novo assemblies of various bacterial species. However, current assembly approaches often suppress strain-level variation and instead produce a species-level consensus representation. Strain variants are often unevenly distributed, and regions of high and low heterozygosity may interleave in the assembly graph, resulting in tangles. To address this, we developed an algorithm for metagenomic phasing and assembly called stRainy. Our approach takes a sequence graph as input, identifies graph regions that represent collapsed strains, phases them, and represents the results in an expanded and simplified assembly graph. We benchmark stRainy using simulated data and mock metagenomic communities and show that it achieves strain-level deconvolution with high completeness and low error rates compared to other strain assembly and phasing approaches.
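Illustration (not from the paper): to make "phasing variants into haplotypes" concrete, the toy Python sketch below groups long reads whose alleles agree at every variant position they share. It is not stRainy's algorithm, which the abstract does not detail. The function name phase_reads, the min_shared parameter, the read names, positions, and alleles are all hypothetical and chosen only for the example.

```python
# Toy illustration only: greedy grouping of reads by allele agreement.
# This is NOT the stRainy method; names and data are hypothetical.

def phase_reads(reads, min_shared=1):
    """Group reads whose alleles agree at every shared variant position.

    reads: dict mapping read name -> {variant position: allele}
    Returns a list of clusters, each with a merged "consensus" allele map
    and the list of read names assigned to it.
    """
    clusters = []
    for name, alleles in reads.items():
        for cluster in clusters:
            shared = set(alleles) & set(cluster["consensus"])
            compatible = len(shared) >= min_shared and all(
                alleles[pos] == cluster["consensus"][pos] for pos in shared
            )
            if compatible:
                # Extend the haplotype with positions this read adds.
                cluster["consensus"].update(alleles)
                cluster["reads"].append(name)
                break
        else:
            # No compatible cluster found: start a new candidate haplotype.
            clusters.append({"consensus": dict(alleles), "reads": [name]})
    return clusters


# Hypothetical reads from two strains that differ at three positions.
example_reads = {
    "r1": {100: "A", 250: "G"},
    "r2": {250: "G", 400: "T"},
    "r3": {100: "C", 250: "T"},
    "r4": {250: "T", 400: "C"},
}
clusters = phase_reads(example_reads)
```

In this toy input, overlapping reads chain together, so each cluster ends up spanning all three positions even though no single read does. A real tool must also handle sequencing errors, uneven coverage, and regions with no variants, which this greedy sketch ignores.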
Copyright belongs to the original authors. Visit the link for more info.
Podcast created by Paper Player, LLC