Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.04.11.536445v1?rss=1
Authors: Ferreira, M., Wendering, P., Arend, M., Silveira, W., Nikoloski, Z.
Abstract: Quantification of how different environmental cues affect protein allocation can provide important insights for understanding cell physiology. While absolute quantification of proteins can be obtained by resource-intensive mass-spectrometry-based technologies, prediction of protein abundances offers another way to obtain insights into protein allocation. Yet, building machine learning models of high accuracy across diverse, sub-optimal conditions remains notoriously difficult due to the dynamic nature of protein allocation. Here we present CAMEL, a framework that couples constraint-based modelling with machine learning to predict protein abundance for any environmental condition. This is achieved by building machine learning models that leverage static features, derived from protein sequences, and condition-dependent features predicted from protein-constrained metabolic models. Our findings demonstrate that CAMEL results in excellent prediction of protein allocation in E. coli (average Pearson correlation of at least 0.9), and moderate performance in S. cerevisiae (average Pearson correlation of at least 0.5). Therefore, CAMEL outperformed contending approaches without using molecular read-outs from unseen conditions and provides a valuable tool for using protein allocation in biotechnological applications.
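Illustration (not from the paper): the performance figures above are average Pearson correlations. The minimal sketch below shows one way such a score could be computed, assuming the average is taken over per-condition correlations between measured and predicted abundances. The abstract does not say what the average is taken over. The function names, condition labels, and numbers are hypothetical and are not taken from CAMEL.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    # Covariance numerator and the two standard-deviation numerators.
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def average_pearson(measured, predicted):
    """Mean of per-condition Pearson correlations (assumed averaging scheme)."""
    scores = [pearson(measured[c], predicted[c]) for c in measured]
    return sum(scores) / len(scores)


# Hypothetical toy abundances for four proteins in two conditions.
measured = {"glucose": [1.0, 2.0, 3.0, 4.0], "acetate": [2.0, 1.0, 4.0, 3.0]}
predicted = {"glucose": [1.1, 1.9, 3.2, 3.8], "acetate": [2.0, 1.0, 4.0, 3.0]}

avg_r = average_pearson(measured, predicted)
print(f"average Pearson r = {avg_r:.3f}")
```

In practice the correlation is often computed on log-transformed abundances, since protein levels span several orders of magnitude; whether CAMEL does so is not stated in the abstract.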
Copyright belongs to the original authors. Visit the link for more information.
Podcast created by Paper Player, LLC