Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.03.20.533501v1?rss=1
Authors: Malbranke, C., Rostain, W., Depardieu, F., Cocco, S., Monasson, R., Bikard, D.
Abstract: We present here an approach to protein design that combines evolutionary and physics-grounded modeling. Using a Restricted Boltzmann Machine (RBM), we learned a sequence model of a protein family and propose a strategy to explore the protein representation space that can be informed by external models such as an empirical force field method (FoldX). This method was applied to a domain of the Cas9 protein responsible for recognition of a short DNA motif. We experimentally assessed the functionality of 71 variants that were generated to explore a range of RBM and FoldX energies. We show how a combination of structural and evolutionary information can identify functional variants with high accuracy. Sequences with as many as 50 differences (20% of the protein domain) to the wild-type retained functionality. Interestingly, some sequences (6/71) produced by our method showed improved activity in comparison with the original wild-type protein sequence. These results demonstrate the value of further exploring the synergies between machine learning of protein sequence representations and physics-grounded modeling strategies informed by structural information.
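Illustrative sketch (not from the paper): the abstract refers to "RBM energies" used to rank variants. One common way to score a sequence with an RBM is its free energy, where lower values mean the model finds the sequence more likely. The snippet below assumes one-hot encoded sequences and binary hidden units, which may differ from the authors' model; the names one_hot, rbm_free_energy, fields, weights and hidden_bias are hypothetical, and the FoldX step is not shown.

```python
import numpy as np

# Assumed alphabet: 20 amino acids plus a gap symbol.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY-"


def one_hot(seq):
    """Encode a sequence as an (L, q) one-hot matrix."""
    idx = [AMINO_ACIDS.index(c) for c in seq]
    x = np.zeros((len(seq), len(AMINO_ACIDS)))
    x[np.arange(len(seq)), idx] = 1.0
    return x


def rbm_free_energy(x, fields, weights, hidden_bias):
    """Free energy of one sequence under an RBM with binary hidden units.

    x:           (L, q) one-hot sequence
    fields:      (L, q) visible-unit fields
    weights:     (H, L, q) couplings between hidden and visible units
    hidden_bias: (H,) hidden-unit biases
    """
    # Contribution of the visible fields for the observed residues.
    visible_term = np.sum(fields * x)
    # Total input received by each hidden unit.
    hidden_input = np.tensordot(weights, x, axes=([1, 2], [0, 1])) + hidden_bias
    # log(1 + exp(input)) summed over hidden units, computed stably.
    hidden_term = np.sum(np.logaddexp(0.0, hidden_input))
    return -visible_term - hidden_term


# Usage with random (untrained) parameters, for shape checking only.
rng = np.random.default_rng(0)
L, q, H = 5, len(AMINO_ACIDS), 4
score = rbm_free_energy(
    one_hot("ACDEF"),
    rng.normal(size=(L, q)),
    rng.normal(size=(H, L, q)),
    rng.normal(size=H),
)
```

In a design loop of the kind the abstract describes, a score like this would be computed for each candidate and considered alongside a separately computed structural energy.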
Copyright belongs to the original authors. Visit the link for more information.
Podcast created by Paper Player, LLC