Fast genome-based delimitation of Enterobacterales species

2023/4/6

PaperPlayer biorxiv bioinformatics

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.04.05.535762v1?rss=1

Authors: Hernandez-Salmeron, J. E., Irani, T., Moreno-Hagelsieb, G.

Abstract: Average Nucleotide Identity (ANI) is becoming a standard measure for bacterial species delimitation. However, its calculation can take orders of magnitude longer than fast similarity estimates based on sampling of short nucleotides, compiled into so-called sketches. Though these estimates correlate well with ANI, they might miss some of what ANI would produce. We therefore compared the results of two fast programs, mash and dashing, against ANI, in delimiting species among publicly available Enterobacterales genomes. Receiver Operating Characteristic (ROC) curve analysis found that all three programs were highly accurate in species delimitation, with Area Under the Curve (AUC) values above 0.99, indicating almost perfect species discrimination. After sub-sampling from over-represented species, the AUC dropped to 0.94 with all three methods. In focused tests with ten genera represented by more than two species, all measures again showed almost identical results, with Shigella showing the lowest AUC value (0.68), followed by Citrobacter (0.79) and Enterobacter (0.91). The remaining genera, Dickeya, Escherichia, Klebsiella, Pectobacterium, Proteus, Providencia and Yersinia, produced AUC values above 0.97. Genome distance ranges varied among the species of these genera. The E. coli+Shigella group remains challenging to separate: it had AUC values lower than 0.6 with all measures tested. Overall, our results suggest that fast estimates of genome similarity are as good as ANI for species delimitation. Therefore, these fast estimates might suffice for determining the role of genomic similarity in bacterial taxonomy.
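
To make the AUC figures in the abstract more concrete, here is a minimal Python sketch of how an AUC can be computed from pairwise genome distances and same-species labels. It is not the authors' pipeline: the function name `auc_from_distances` is hypothetical, the toy numbers are invented for illustration, and the approach shown is the standard pairwise (Mann-Whitney) formulation of AUC, assuming that a smaller distance should indicate the same species.

```python
def auc_from_distances(distances, same_species):
    """Return the AUC for treating a small distance as evidence of 'same species'.

    AUC here is the fraction of (same-species, different-species) pair
    combinations in which the same-species pair has the smaller distance;
    ties count as half. This is the Mann-Whitney formulation of AUC.
    """
    pos = [d for d, s in zip(distances, same_species) if s]
    neg = [d for d, s in zip(distances, same_species) if not s]
    if not pos or not neg:
        raise ValueError("need both same-species and different-species pairs")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p < n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented toy data (NOT from the paper): one distance per genome pair,
# and whether the two genomes carry the same species label.
toy_distances = [0.01, 0.02, 0.04, 0.03, 0.08, 0.12]
toy_same = [True, True, True, False, False, False]

toy_auc = auc_from_distances(toy_distances, toy_same)
print(f"AUC = {toy_auc:.3f}")
```

An AUC of 1.0 would mean every same-species pair is closer than every different-species pair, while 0.5 is what a random score would give. Read this way, the values below 0.6 reported for the E. coli+Shigella group indicate that distance barely separates those labels.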

Copyright belongs to the original authors. Visit the link for more info.

Podcast created by Paper Player, LLC