cover of episode Identification of Low-Complexity Domains by Compositional Signatures Reveals Class-Specific Frequencies and Functions Across Reference Proteomes

Identification of Low-Complexity Domains by Compositional Signatures Reveals Class-Specific Frequencies and Functions Across Reference Proteomes

2023/7/25
logo of podcast PaperPlayer biorxiv bioinformatics

PaperPlayer biorxiv bioinformatics

Frequently requested episodes will be transcribed first

Shownotes Transcript

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.07.24.550260v1?rss=1

Authors: Cascarina, S. M., Ross, E. D.

Abstract: Low-complexity domains (LCDs) in proteins are typically enriched in one or two predominant amino acids. As a result, LCDs often exhibit unusual structural/biophysical tendencies and can occupy functional niches. However, for each organism, protein sequences must be compatible with intracellular biomolecules and physicochemical environment, both of which vary from organism to organism. This raises the possibility that LCDs may occupy sequence spaces in select organisms that are otherwise prohibited in most organisms. Here, we report a comprehensive survey of LCDs in all known reference proteomes ( greater than 21k organisms), with added focus on rare and unusual types of LCDs. LCDs were sorted into a rich hierarchical database according to both the primary amino acid and secondary amino acid in each LCD sequence, facilitating detailed comparisons of LCD class frequencies across organisms. Examination of LCD classes at different depths (i.e., domain of life, organism, protein, and per-residue levels) reveals unique facets of LCD frequencies and functions. To our surprise, all 400 LCD classes occur in nature, although some are exceptionally rare. A number of rare classes can be defined for each domain of life, with many LCD classes appearing to be eukaryote-specific. Multiple eukaryote-specific LCD classes could be linked to consistent sets of functions across organisms. Our analysis methods enable simultaneous, direct comparison of all LCD classes between individual organisms, resulting in a proteome-scale view of differences in LCD frequencies and functions. Together, these results highlight the remarkable diversity and functional specificity of LCDs across all known life forms.

Copy rights belong to original authors. Visit the link for more info

Podcast created by Paper Player, LLC