Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.02.17.528962v1?rss=1
Authors: Ammar, C., Schessner, J. P., Willems, S., Michaelis, A. C., Mann, M.
Abstract: Recent advances in mass spectrometry (MS)-based proteomics enable the acquisition of increasingly large datasets within relatively short times, which exposes bottlenecks in the bioinformatics pipeline. Whereas peptide identification is already scalable, most label-free quantification (LFQ) algorithms scale quadratically or cubically with the number of samples, which may even preclude the analysis of large-scale data. Here we introduce directLFQ, a ratio-based approach for sample normalization and the calculation of protein intensities. It estimates quantities by aligning samples and ion traces, shifting them on top of each other in logarithmic space. Importantly, directLFQ scales linearly with the number of samples, allowing analyses of large studies to finish in minutes instead of days or months. We quantify 10,000 proteomes in 10 minutes and 100,000 proteomes in less than two hours, a thousand-fold faster than some implementations of the popular LFQ algorithm MaxLFQ. In-depth characterization of directLFQ reveals excellent normalization properties and benchmark results, comparing favorably to MaxLFQ for both data-dependent acquisition (DDA) and data-independent acquisition (DIA). Additionally, directLFQ provides normalized peptide intensity estimates for peptide-level comparisons. It is available as an open-source Python package and as a GUI with a one-click installer, and it can be used in the AlphaPept ecosystem as well as downstream of most common computational proteomics pipelines.
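As a rough illustration of the shifting idea described in the abstract, the sketch below aligns samples by subtracting one constant per sample in log2 space. It is not the directLFQ implementation and does not use the directLFQ package API: the function name align_samples, the dictionary layout, the median-based reference profile, and the toy numbers are all assumptions made for this example. Each sample is compared with a single reference profile rather than with every other sample, which is the kind of design that avoids pairwise sample comparisons.

```python
from statistics import median


def align_samples(log_table):
    """Shift each sample onto a shared reference in log space.

    log_table maps sample name -> {ion name -> log2 intensity}.
    Missing ions are simply absent from a sample's dictionary.
    Hypothetical helper for illustration; not part of directLFQ.
    """
    # Collect every ion seen in at least one sample.
    ions = sorted({ion for trace in log_table.values() for ion in trace})

    # Assumed reference profile: per-ion median over the samples that have it.
    reference = {
        ion: median(t[ion] for t in log_table.values() if ion in t)
        for ion in ions
    }

    aligned, shifts = {}, {}
    for sample, trace in log_table.items():
        # One constant shift per sample: the median log difference to the
        # reference over the ions this sample actually contains.
        shift = median(value - reference[ion] for ion, value in trace.items())
        shifts[sample] = shift
        aligned[sample] = {ion: value - shift for ion, value in trace.items()}
    return aligned, shifts


# Toy data: three samples sharing one profile, offset by a constant in log2
# space (for example, different loading amounts). s3 lacks ion_b.
base = {"ion_a": 20.0, "ion_b": 22.0, "ion_c": 25.0, "ion_d": 18.0}
offsets = {"s1": 0.0, "s2": 1.0, "s3": -0.5}
table = {s: {ion: v + off for ion, v in base.items()} for s, off in offsets.items()}
del table["s3"]["ion_b"]

aligned, shifts = align_samples(table)
print(shifts)
```

A constant shift in log space corresponds to a multiplicative scaling factor on the raw intensities, which is why a ratio-based normalization can be expressed as simple subtraction after taking logarithms.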
Copyright belongs to the original authors. Visit the link for more info.
Podcast created by Paper Player, LLC