Improved interpretability of bacterial genome-wide associations using gene cluster centric k-mers

2023/4/12
PaperPlayer biorxiv bioinformatics

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.04.11.536385v1?rss=1

Authors: Neubauer, H., Galardini, M.

Abstract: The wide adoption of bacterial genome sequencing and encoding both core and accessory genome variation using k-mers has allowed bacterial genome-wide association studies (GWAS) to identify genetic variants associated with relevant phenotypes such as those linked to infection. Significant limitations remain in the interpretation of association results, which hinders the wider adoption of GWAS methods on microbial datasets. We have developed a simple computational method (panfeed) that explicitly links each k-mer to its gene cluster at base-resolution level, which allows us to avoid biases introduced by a global de Bruijn graph as well as more easily map and annotate associated variants. We tested panfeed on two independent datasets, correctly identifying previously characterized causal variants, which demonstrates the precision of the method as well as its scalable performance. panfeed is a command-line tool written in the Python programming language and available at https://github.com/microbial-pangenomes-lab/panfeed.
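
To illustrate the idea of linking each k-mer to its gene cluster at base resolution, here is a minimal Python sketch. It is not panfeed's implementation: the function name `kmers_by_cluster`, the input layout, and the toy sequences are all hypothetical and chosen only for illustration. The sketch indexes k-mers per gene cluster rather than in one global structure, so the same k-mer found in two clusters is recorded separately, along with the sample and offset where it occurs.

```python
from collections import defaultdict


def kmers_by_cluster(cluster_sequences, k):
    """Index k-mers per gene cluster (illustrative only, not panfeed's code).

    cluster_sequences: dict mapping cluster name -> {sample name -> gene sequence}
    Returns a dict mapping (cluster, k-mer) -> list of (sample, offset) hits.
    """
    index = defaultdict(list)
    for cluster, samples in cluster_sequences.items():
        for sample, seq in samples.items():
            # Slide a window of length k along the gene sequence.
            for offset in range(len(seq) - k + 1):
                kmer = seq[offset:offset + k]
                index[(cluster, kmer)].append((sample, offset))
    return index


# Hypothetical toy input: two clusters, two samples.
toy = {
    "clusterA": {"s1": "ATGCA", "s2": "ATGGA"},
    "clusterB": {"s1": "ATGCA"},
}
index = kmers_by_cluster(toy, 3)
```

Because the key includes the cluster name, an associated k-mer can be traced directly back to a gene cluster and a position within it, which is the property the abstract highlights for easier mapping and annotation.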

Copyright belongs to the original authors. Visit the link for more information.

Podcast created by Paper Player, LLC