A comparison between Greengenes, SILVA, RDP, and NCBI reference databases in four published microbiota datasets

2023/4/13

PaperPlayer biorxiv bioinformatics

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.04.12.535864v1?rss=1

Authors: Ceccarani, C., Severgnini, M.

Abstract: Inaccurate bacterial taxonomic assignment in 16S-based microbiota experiments could have deleterious effects on research results, as all downstream analyses heavily rely on the accurate assessment of microbial taxonomy: a bias in the choice of the reference database can deeply alter microbiota biodiversity (alpha-diversity), composition (beta-diversity), and taxa profile (bacterial relative abundances). In this paper, we explored the influence of the reference 16S rRNA collection by performing a classification against four of the main databases used by the scientific community (i.e. Greengenes, SILVA, RDP, NCBI); the consequences of database clustering at 97% were also explored. To investigate the effects of the database choice on real and representative microbiome samples from different ecosystems, we performed a comparative analysis on four already published datasets from various sources: stools from a mouse model experiment, bovine milk, human gut microbiota stool samples, and swabs from the human vaginal environment. We took into consideration the computational time needed to perform the taxonomic classification as well. Although values in both alpha- and beta-diversity varied a lot, sometimes even statistically, according to the dataset chosen and the eventual clustering, the final outcome of the analysis was a concordance in the capability to retrieve the original experimental group differences over the various datasets. However, in the taxonomy classification, we found several inconsistencies with taxonomies correctly assigned in only some of the four databases. The degree of concordance among the databases was related to both the complexity of the environment and its degree of completeness in the reference databases.

Copyright belongs to the original authors. Visit the link for more info.

Podcast created by Paper Player, LLC
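As a rough illustration of why the choice of reference database can shift diversity values, the sketch below classifies the same hypothetical sample two ways and compares the results. The genus labels and read counts are invented for illustration and are not taken from the paper. The Shannon index and Bray-Curtis dissimilarity are common choices for alpha- and beta-diversity, but the abstract does not say which metrics the authors used, so treat both as assumptions.

```python
import math

def shannon(counts):
    """Shannon index (natural log) for a list of per-taxon read counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two taxon -> count profiles."""
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0), b.get(t, 0)) for t in taxa)
    return 1 - 2 * shared / (sum(a.values()) + sum(b.values()))

# Hypothetical profiles: the same 100 reads classified against two databases.
# In database B, the reads that database A assigns to genus_C are assumed
# to be absorbed into genus_B (for example, because genus_C is missing).
profile_db_a = {"genus_A": 60, "genus_B": 30, "genus_C": 10}
profile_db_b = {"genus_A": 60, "genus_B": 40}

alpha_a = shannon(list(profile_db_a.values()))
alpha_b = shannon(list(profile_db_b.values()))
shift = bray_curtis(profile_db_a, profile_db_b)

print(f"Shannon index, database A: {alpha_a:.3f}")
print(f"Shannon index, database B: {alpha_b:.3f}")
print(f"Bray-Curtis between the two classifications: {shift:.3f}")
```

Here the Bray-Curtis value compares two classifications of one sample rather than two samples, simply to quantify how far the taxa profile moves when only the database changes.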
