Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.07.04.547735v1?rss=1
Authors: Constantinides, B., Crook, D. W.
Abstract: Microbial sequences generated from clinical samples are often contaminated with human host sequences that must be removed for ethical and legal reasons. Care must be taken to excise host sequences without inadvertently removing target microbial sequences to the detriment of downstream analyses such as variant calling and de novo assembly. To facilitate accurate host decontamination of both short and long sequencing reads, we developed Hostile, a tool capable of rapid host read removal using laptop specification hardware. We demonstrate that our approach removes at least 99.868% of real human reads and retains at least 99.997% of simulated bacterial reads. Use of a masked reference genome further increases bacterial read retention (≥99.997%) with negligible (<0.001%) reduction in human read removal performance. Compared with an existing tool, Hostile removed up to 11x more human reads and up to 11x fewer microbial reads while taking less time for typical workloads. Hostile is implemented as an MIT licensed Python package available at https://github.com/bede/hostile
Copyright belongs to the original authors. Visit the link for more information.
Podcast created by Paper Player, LLC