Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.07.11.548644v1?rss=1
Authors: Ghazikhani, H., Butler, G.
Abstract: This study presents TooT-PLM-ionCT, a composite framework of three distinct systems, each with a different architecture and trained on its own dataset. Each system is dedicated to one task: separating ion channels (ICs) from other membrane proteins, separating ion transporters (ITs) from other membrane proteins, or differentiating ICs from ITs. The systems draw on six diverse Protein Language Models (PLMs), including ProtBERT, ProtBERT-BFD, ESM-1b, ESM-2 (650M parameters), and ESM-2 (15B parameters). Because ICs and ITs regulate ion movement across cellular membranes, they are integral to numerous biological processes and to overall cellular vitality. To avoid costly and time-consuming wet lab experiments, we use the predictive power of PLMs, drawing parallels with techniques in natural language processing. Our strategy employs six classifiers, spanning conventional methods and a deep learning model, for each task. We also examine factors that influence these tasks: dataset balancing, frozen versus fine-tuned PLM representations, and half versus full precision floating-point computation. Our empirical results show superior performance in distinguishing ITs from other membrane proteins and in differentiating ICs from ITs, while performance in discriminating ICs from other membrane proteins is commensurate with the current state of the art.
Copyright belongs to the original authors. Visit the link for more info.
Podcast created by Paper Player, LLC