A Weighted Two-stage Sequence Alignment Framework to Identify DNA Motifs from ChIP-exo Data

2023/4/8
PaperPlayer biorxiv bioinformatics

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.04.06.535915v1?rss=1

Authors: Li, Y., Wang, Y., Wang, C., Fennel, A., Ma, A., Jiang, J., Liu, Z., Ma, Q., Liu, B.

Abstract: Identifying precise transcription factor binding sites (TFBS) or regulatory DNA motifs plays a fundamental role in researching transcriptional regulatory mechanisms in cells and in helping construct regulatory networks. Current algorithms developed for motif searching focus on the analysis of ChIP-enriched peaks but are not able to integrate the ChIP signal at nucleotide resolution. We present a weighted two-stage alignment tool (TESA). Our framework implements an analysis workflow from experimental datasets to TFBS prediction results. It employs a binomial distribution model and a graph searching model with ChIP-exonuclease (ChIP-exo) read depth and sequence data. TESA can effectively measure the likelihood that each position in a given promoter sequence is an actual TFBS and predict statistically significant TFBS sequence segments. The algorithm substantially improves prediction accuracy and extends the scope of applicability of existing approaches. We apply the framework to a collection of 20 ChIP-exo datasets of E. coli from proChIPdb and evaluate the prediction performance through comparison with three existing programs. The performance evaluation against the compared programs indicates that TESA is more accurate for identifying regulatory motifs in prokaryotic genomes.
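
The abstract does not spell out how the binomial distribution model uses read depth, so the following is only an illustrative sketch and not TESA's actual implementation. It assumes a simple null model in which every read in a promoter lands on any position with equal probability, and scores each position by the binomial upper-tail probability of seeing at least its observed depth. The function names, the toy depth vector, and the 0.05 cutoff are all hypothetical.

```python
import math

def binom_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), with each term built in log space."""
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log(1 - p))
        total += math.exp(log_pmf)
    return min(total, 1.0)

def position_pvalues(depth):
    """Score each position against a uniform null (assumption: p = 1 / length)."""
    n = sum(depth)            # total reads covering the sequence
    p = 1.0 / len(depth)      # chance that a read lands on a given position
    return [binom_upper_tail(d, n, p) for d in depth]

# Hypothetical per-nucleotide ChIP-exo read depth over a 10 bp window.
depth = [1, 2, 1, 9, 10, 8, 1, 0, 2, 1]
pvals = position_pvalues(depth)

# Positions whose depth is unlikely under the null (illustrative 0.05 cutoff).
enriched = [i for i, pv in enumerate(pvals) if pv < 0.05]
print(enriched)
```

In a scheme like this, low tail probabilities could serve as per-position weights in a later alignment step; how TESA actually combines such weights with its graph searching model is described in the paper, not here.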

Copyright belongs to the original authors. Visit the link for more info.

Podcast created by Paper Player, LLC