Simulations of sequence evolution: how (un)realistic they really are and why

2023/7/12
PaperPlayer biorxiv bioinformatics

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.07.11.548509v1?rss=1

Authors: Trost, J., Haag, J., Hoehler, D., Nesterenko, L., Jacob, L., Stamatakis, A., Boussau, B.

Abstract: Motivation: Simulating sequence evolution plays an important role in the development and evaluation of phylogenetic inference tools. Naturally, the simulated data must be as realistic as possible to be indicative of the performance of the developed tools on empirical data. Over the years, numerous phylogenetic sequence simulators, employing various models of evolution, have been published with the goal of simulating such empirical-like data. In this study, we simulated DNA and protein Multiple Sequence Alignments (MSAs) under increasingly complex models of evolution, with and without insertion/deletion (indel) events, using a state-of-the-art sequence simulator. We assessed their realism by quantifying how well supervised learning methods can predict whether a given MSA is simulated or empirical. Results: Our results show that we can distinguish between empirical and simulated MSAs with high accuracy using two distinct and independently developed classification approaches across all tested models of sequence evolution. Our findings suggest that the current state-of-the-art models fail to accurately replicate the process of evolution.

Copyrights belong to the original authors. Visit the link for more information.

Podcast created by Paper Player, LLC