Reply to: The pitfalls of negative data bias for the T-cell epitope specificity challenge

2023/4/17
PaperPlayer biorxiv bioinformatics

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.04.07.535967v1?rss=1

Authors: Gao, Y., Gao, Y., Dong, K., Wu, S., Liu, Q.

Abstract: We noticed that Pieter Meysman et al. recently pointed out the negative data sampling issue in T-cell epitope specificity prediction. Given the limited data available in this area, negative sampling matters for biological data modeling in general, since biological experiments tend to record positive results while ignoring negative ones. We appreciate their efforts to raise this point for T-cell epitope specificity modeling, where the community is well aware that different negative data sampling strategies influence prediction results. A proper negative data sampling strategy should therefore be selected carefully, and this is exactly what PanPep has noticed, emphasized and performed. In short, of the two commonly used negative sampling strategies, i.e., reshuffling based on positive pairs (first strategy) and randomly drawing from background repertoires (second strategy), PanPep prefers the second, and the rationale is clearly stated in the manuscript. We now clarify this point further by formulating the problem as positive-unlabeled (PU) learning and call for more attention to it.
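
To make the two negative sampling strategies named in the abstract concrete, here is a minimal Python sketch. It is not PanPep's implementation: the function names, the representation of a pair as a `(tcr, epitope)` tuple, and the placeholder identifiers are all assumptions made for illustration only.

```python
import random


def shuffle_negatives(positive_pairs, n, rng):
    """First strategy (illustrative): re-pair TCRs and epitopes that
    both already appear in the positive set."""
    positives = set(positive_pairs)
    tcrs = sorted({t for t, _ in positive_pairs})
    epitopes = sorted({e for _, e in positive_pairs})
    # Every cross pairing that is not a recorded positive is a candidate.
    candidates = [(t, e) for t in tcrs for e in epitopes
                  if (t, e) not in positives]
    return rng.sample(candidates, min(n, len(candidates)))


def background_negatives(positive_pairs, background_tcrs, n, rng):
    """Second strategy (illustrative): pair epitopes with TCRs drawn
    from a background repertoire."""
    positive_tcrs = {t for t, _ in positive_pairs}
    epitopes = sorted({e for _, e in positive_pairs})
    # Exclude background TCRs that also occur among the positives.
    pool = [t for t in background_tcrs if t not in positive_tcrs]
    candidates = [(t, e) for t in pool for e in epitopes]
    return rng.sample(candidates, min(n, len(candidates)))


if __name__ == "__main__":
    # Hypothetical placeholder identifiers, not real sequences.
    positives = [("TCR_A", "EPI_1"), ("TCR_B", "EPI_2"), ("TCR_C", "EPI_1")]
    background = ["BG_1", "BG_2", "TCR_A"]
    rng = random.Random(0)
    print(shuffle_negatives(positives, 2, rng))
    print(background_negatives(positives, background, 3, rng))
```

Neither sampler produces experimentally confirmed non-binders. In PU-learning terms, the sampled pairs are more accurately described as unlabeled than as negative, which is the framing the abstract proposes.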

Copyright belongs to the original authors. Visit the link for more information.

Podcast created by Paper Player, LLC