Probability and Statistics Seminar:
6 February 2017 at 13:45 - UM - Building 09 - Conference room (1st floor)
Presented by Jacob Laurent - Université de Lyon I
More powerful differential analysis of relative quantitative proteomics data by leveraging shared peptides
One of the most classical pipelines in discovery proteomics experiments involves an enzymatic digestion of the proteins into peptides, followed by the identification and quantification of the latter by mass spectrometry. Because many proteins have homologous sequences, different proteins can yield peptides with identical amino acid chains, so that the parent protein of such a peptide is ambiguous. These so-called shared peptides make the protein-level statistical analysis more challenging and are often discarded. In this article, we propose a likelihood ratio test of protein differential abundance based on a linear model which accounts for shared peptides. We show that on a problem with \(n\) samples and \(q\) peptides our statistic can be computed in \(O(nq)\) time. We also provide an asymptotic null distribution for a version of our statistic based on a regularized maximum likelihood estimator of the variance in the linear model. Our test outperforms state-of-the-art methods in terms of precision-recall on both real and simulated datasets. Finally, an R package is provided for direct application by proteomics practitioners.