Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.07.05.547866v1?rss=1
Authors: Sanz Moreta, L.
Abstract: Calculating similarities among sequences (e.g., biological sequences) can be a challenging task. Here I introduce Dromi, a simple Python package that computes different similarity measures (e.g., percent identity, cosine similarity, k-mer similarities) across aligned, vector-encoded sequences. Computing these similarities is a crucial step in both upstream and downstream sequence machine learning tasks, such as sequence clustering, sequence analysis, and other pre- or post-processing of sequences. Additionally, the package introduces the calculation of a measure referred to as positional weights. These represent the cosine similarities, or residue conservation, across sequence elements (e.g., amino acids in peptide sequences) at the same site (column). The program can also handle sequences of variable length, since end-padded positions are excluded from the calculations. The implementations presented here add to the arsenal of tools for measuring similarity among small peptide sequences such as epitopes.
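As a rough illustration of the measures the abstract names, here is a minimal NumPy sketch. It does not use Dromi's API: every function name, the one-hot encoding, and the `#` padding symbol are assumptions made for this example. The positional weight is computed as the mean pairwise cosine similarity among the non-padded residues of a column, which is only one plausible reading of the abstract's description.

```python
import numpy as np

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids
PAD = "#"  # assumed padding symbol (hypothetical choice for this sketch)


def one_hot(seq, alphabet=ALPHABET):
    """Encode a sequence as a (length, |alphabet|) one-hot matrix."""
    index = {aa: i for i, aa in enumerate(alphabet)}
    mat = np.zeros((len(seq), len(alphabet)))
    for pos, aa in enumerate(seq):
        if aa in index:  # unknown symbols stay as all-zero rows
            mat[pos, index[aa]] = 1.0
    return mat


def valid_columns(a, b, pad=PAD):
    """Return the residues of a and b at columns where neither is padded."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != pad and y != pad]
    return "".join(x for x, _ in pairs), "".join(y for _, y in pairs)


def percent_identity(a, b, pad=PAD):
    """Identity over non-padded columns, as a fraction in [0, 1]."""
    a, b = valid_columns(a, b, pad)
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cosine_similarity(a, b, pad=PAD):
    """Cosine similarity of the flattened one-hot encodings, padding excluded."""
    a, b = valid_columns(a, b, pad)
    u, v = one_hot(a).ravel(), one_hot(b).ravel()
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


def positional_weights(seqs, pad=PAD):
    """Per-column mean pairwise cosine similarity among non-padded residues."""
    length = len(seqs[0])
    weights = np.zeros(length)
    for col in range(length):
        residues = "".join(s[col] for s in seqs if s[col] != pad)
        n = len(residues)
        if n < 2:  # a single residue has nothing to be compared with
            continue
        vecs = one_hot(residues)
        norms = np.linalg.norm(vecs, axis=1, keepdims=True)
        norms[norms == 0] = 1.0  # avoid dividing all-zero rows by zero
        unit = vecs / norms
        sims = unit @ unit.T
        # Average over ordered pairs, leaving out the self-similarity diagonal.
        weights[col] = (sims.sum() - np.trace(sims)) / (n * (n - 1))
    return weights


peptides = ["ACDE", "ACDF", "ACG#"]
print(percent_identity(peptides[0], peptides[1]))   # 0.75
print(cosine_similarity(peptides[0], peptides[2]))  # about 0.667
print(positional_weights(peptides))                 # about [1, 1, 0.333, 0]
```

With one-hot vectors, the cosine similarity between two residues is 1 when they match and 0 otherwise, so the positional weight here reduces to the fraction of matching residue pairs in a column. A different encoding (for example, a substitution-matrix embedding) would yield graded values instead.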
Copyright belongs to the original authors. Visit the link for more info.
Podcast created by Paper Player, LLC