Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.07.12.548726v1?rss=1
Authors: Ramakrishnan, P., Bromberg, Y.
Abstract: In silico functional annotation of proteins is crucial to narrowing the sequencing-accelerated gap in our understanding of protein activities. Numerous function annotation methods exist, and their ranks have been growing, particularly with recent deep learning-based developments. However, it is unclear if these tools are truly predictive, as most do not identify new terms in functional ontologies. We thus ask, can they identify sequences that are capable of carrying out known molecular functions but are non-homologous to, or far removed from, known protein families? Here, we explore the potential and limitations of the existing methods in predicting molecular functions of thousands of such orphan proteins. Lacking ground-truth functional annotations, we translated the assessment of function prediction into an evaluation of the functional similarity of orphan siblings. Notably, our approach transcends the limitations of functional annotation vocabularies and provides a platform to compare different methods without the need for mapping terms across ontologies. We find that most existing methods are limited to homology-based annotations and are thus descriptive, rather than predictive, of function. Curiously, despite a scope seemingly unlimited by homology, novel deep learning methods remain far from capturing the functional signal encoded in protein sequence. We believe that our work will inspire the development of a new generation of methods that push our knowledge boundaries and promote exploration and discovery in the molecular function domain.
Copyright belongs to the original authors. Visit the link for more info.
Podcast created by Paper Player, LLC