Uni-Fold MuSSe: De Novo Protein Complex Prediction with Protein Language Models

2023/2/15
PaperPlayer biorxiv bioinformatics

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.02.14.528571v1?rss=1

Authors: Zhu, J., He, Z., Li, Z., Ke, G., Zhang, L.

Abstract: Accurately solving the structures of protein complexes is crucial for understanding and further modifying biological activities. Recent success of AlphaFold and its variants shows that deep learning models are capable of accurately predicting protein complex structures, yet with the painstaking effort of homology search and pairing. To bypass this need, we present Uni-Fold MuSSe (Multimer with Single Sequence inputs), which predicts protein complex structures from their primary sequences with the aid of pre-trained protein language models. Specifically, we built protein complex prediction models based on the protein sequence representations of ESM-2, a large protein language model with 3 billion parameters. In order to adapt the language model to inter-protein evolutionary patterns, we slightly modified and further pre-trained the language model on groups of protein sequences with known interactions. Our results highlight the potential of protein language models for complex prediction and suggest room for improvements.
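The abstract does not say how several chains are presented to the model as a single input. As an illustration only, the sketch below packs the primary sequences of a complex into one token string with a per-residue chain index and a residue index that jumps between chains. The names `ComplexInput`, `pack_chains`, and `gap` are hypothetical, and the index-gap convention is an assumption borrowed from common single-sequence multimer workarounds, not the authors' method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ComplexInput:
    tokens: str               # residues of all chains, concatenated
    chain_ids: List[int]      # chain index for each residue
    residue_index: List[int]  # position for each residue, with a gap between chains


def pack_chains(chains: List[str], gap: int = 200) -> ComplexInput:
    """Concatenate chain sequences and record which chain each residue belongs to.

    `gap` is a hypothetical offset added between chains so that a
    position-aware model does not treat them as one continuous polymer.
    """
    tokens: List[str] = []
    chain_ids: List[int] = []
    residue_index: List[int] = []
    offset = 0
    for chain_id, seq in enumerate(chains):
        if not seq:
            raise ValueError("each chain needs at least one residue")
        for pos, aa in enumerate(seq):
            tokens.append(aa)
            chain_ids.append(chain_id)
            residue_index.append(offset + pos)
        # The next chain starts after this chain's length plus the gap.
        offset += len(seq) + gap
    return ComplexInput("".join(tokens), chain_ids, residue_index)
```

For example, `pack_chains(["MKT", "GA"], gap=10)` yields the tokens `MKTGA`, chain indices `[0, 0, 0, 1, 1]`, and residue indices `[0, 1, 2, 13, 14]`. Features like these would then be passed to a language model such as ESM-2 to obtain per-residue representations.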

Copyright belongs to the original authors. Visit the link for more information.

Podcast created by Paper Player, LLC