Determining epitope specificity of T-cell receptors with transformers

2023/4/2

PaperPlayer biorxiv bioinformatics

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.03.31.534974v1?rss=1

Authors: Khan, A. R., Reinders, M., Khatri, I.

Abstract:
Motivation: T-cell receptors (TCRs) on T cells recognize and bind to epitopes presented by the major histocompatibility complex (MHC) during infection or cancer. However, the high diversity of TCRs, together with the unique and complex binding mechanisms underlying epitope recognition, makes it difficult to predict binding between a TCR and an epitope. Here, we present the utility of transformers, a deep learning strategy whose attention mechanism learns the informative features, and show that these models, pretrained on a large set of protein sequences, outperform current strategies.
Method: We compared three pre-trained auto-encoder transformer models (ProtBERT, ProtAlbert, ProtElectra) and one pre-trained auto-regressive transformer model (ProtXLNet) for predicting the binding specificity of TCRs to 25 epitopes from the VDJdb database (human and murine). Two additional modifications were made to each of the four transformer models to incorporate the gene usage of the TCRs.
Results: Of all 12 transformer implementations (4 models, each in 3 variants: the basic model plus the two modifications), a modified version of the ProtXLNet model predicted TCR-epitope pairs with the highest accuracy (weighted F1 score of 0.55 when considering all 25 epitopes simultaneously). The modification added features representing the gene names of the TCRs. We also showed that the basic implementation of transformers outperformed the previously available methods developed for the same biological problem, i.e. TCRGP, TCRDist and DeepTCR, especially for the hard-to-classify labels.
Conclusion: We show that the proficiency of transformers in attention learning can indeed be made operational in a complex biological setting such as TCR binding prediction. Further ingenuity in using the full potential of transformers, either through attention head visualization or by introducing additional features, can further extend T-cell research avenues.
Availability: Data and code are available on https://github.com/InduKhatri/tcrformer
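
For readers unfamiliar with the metric quoted in the Results, the following is a minimal sketch of a weighted F1 score over several classes: the F1 score of each class is averaged, with each class weighted by its number of true examples. It assumes the common support-weighted definition; the abstract does not state which implementation the authors used, and the function name `weighted_f1` and the toy labels "A", "B", "C" are hypothetical stand-ins for epitope classes, not data from the paper.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (assumed definition)."""
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is positive
        # because every class here has at least one true example.
        f1 = 2 * tp / (2 * tp + fp + fn)
        score += (n / total) * f1
    return score


# Toy example with three hypothetical classes, not real TCR data.
y_true = ["A", "A", "A", "B", "B", "C"]
y_pred = ["A", "A", "B", "B", "B", "C"]
print(round(weighted_f1(y_true, y_pred), 4))
```

Because each class is weighted by its support, frequent classes dominate the score, so a single figure such as 0.55 over 25 epitopes can hide large differences between easy and hard-to-classify labels.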

Copyright belongs to the original authors. Visit the link for more info.

Podcast created by Paper Player, LLC