Standardizing and applying a mating-based whole-genome simulation approach reveals caution in using chromosome-level PCA and kinship estimates

2023/3/16

PaperPlayer biorxiv bioinformatics

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.03.16.532885v1?rss=1

Authors: Cui, Z., Schumacher, F.

Abstract: This paper presents a new and efficient method for simulating pseudo-genotype data using the standardized protocol of SLiM, which offers a flexible alternative to traditional methods that rely on large genetic datasets. These datasets can be time-consuming to obtain, especially when institutional review board (IRB) review is involved, making simulation an attractive alternative. While HapGen v2 is the most popular genotype simulator, we found that SLiM has the potential for more customizable simulation to meet multiple needs. To validate our new method, we compared its performance among parallel simulations varying multiple parameters. Our results showed that SLiM is capable of simulating samples up to 333 times the input size, with a low rate of simulated samples that are 2nd or closer relatives (REV), making it a promising alternative to HapGen. We also applied our whole-genome simulation approach to sensitivity analyses of chromosome-level principal component analysis (PCA) and kinship estimation. Our findings revealed important insights into the sensitivity of PCA and kinship estimation, highlighting the unequal distribution of population structure across chromosomes and ancestries. Furthermore, our study provides experimental support for avoiding chromosome-level quality control statistics. Overall, our standardized protocol of SLiM offers a flexible new way to produce pseudo-genotype data, and our findings provide valuable insights that can advance research in the field. By demonstrating the potential of SLiM for more customizable simulations and highlighting the importance of considering the distribution of population structure across chromosomes and ancestries, our research has significant implications for the study of genetics and genomics.
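
As a rough illustration of the relatedness statistic mentioned in the abstract (the rate of simulated samples that are 2nd-degree or closer relatives), here is a minimal Python sketch. It is not the authors' code. It assumes KING-style pairwise kinship coefficients and the commonly used KING lower bound of 0.0884 for 2nd-degree relatives; the function name `related_sample_rate` and the toy matrix are hypothetical, and the paper may define or threshold the statistic differently.

```python
# Illustrative sketch only; not taken from the paper.
# Assumption: kinship values are KING-style coefficients, where 0.0884 is the
# commonly used lower bound for 2nd-degree relatives.
KING_SECOND_DEGREE_CUTOFF = 0.0884


def related_sample_rate(kinship, cutoff=KING_SECOND_DEGREE_CUTOFF):
    """Return the fraction of samples that have at least one other sample
    with pairwise kinship >= cutoff (i.e. a 2nd-degree or closer relative).

    `kinship` is a square, symmetric list of lists; the diagonal is ignored.
    """
    n = len(kinship)
    if n == 0:
        return 0.0
    related = [
        any(kinship[i][j] >= cutoff for j in range(n) if j != i)
        for i in range(n)
    ]
    return sum(related) / n


# Toy 4-sample kinship matrix (made-up values): samples 0 and 1 are
# first-degree relatives, samples 2 and 3 are unrelated to everyone.
toy_kinship = [
    [0.50, 0.25, 0.00, 0.01],
    [0.25, 0.50, 0.02, 0.00],
    [0.00, 0.02, 0.50, 0.03],
    [0.01, 0.00, 0.03, 0.50],
]

rate = related_sample_rate(toy_kinship)
print(f"Fraction of samples with a 2nd-degree or closer relative: {rate:.2f}")
```

A low value of this fraction in a simulated cohort would correspond to the "low rate" of close relatives the abstract describes, under the assumptions stated above.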

Copyright belongs to the original authors. Visit the link for more info.

Podcast created by Paper Player, LLC