Results 1 - 4 of 4
Towards the Prediction of Protein Abundance from Tandem Mass Spectrometry Data
2006
Cited by 3 (0 self)
Abstract
This paper addresses a central problem of Proteomics: estimating the amounts of each of the thousands of proteins in a cell culture or tissue sample. Although laboratory methods involving isotopes have been developed for this problem, we seek a method that uses simpler laboratory procedures. Specifically, our aim is to use data-mining techniques to infer protein levels from the relatively cheap and abundant data available from high-throughput tandem mass spectrometry (MS/MS). We have developed and evaluated several techniques for tackling this problem, including the development of three generative models of MS/MS data, and methods for efficiently fitting the models to data. In addition, we tested each method on three real-world datasets generated by MS/MS experiments performed on various tissue samples taken from Mouse. This paper outlines the biological
Comparison of Discrimination Methods for Peptide Classification in Tandem Mass Spectrometry
2004
Cited by 1 (1 self)
Abstract
Proteomics, the direct analysis of the expressed protein components of a cell, is critical to our understanding of cellular biological processes. Key insights into the action and effects of a disease can be obtained by comparing protein expression in normal versus diseased tissue. Tandem mass spectrometry (MS/MS) of peptides is a central technology for Proteomics, enabling the identification of thousands of peptides from a complex mixture. With the increasing acquisition rate of tandem mass spectrometers, there is an increasing potential to solve important biological problems by applying data-mining and machine-learning techniques to MS/MS data. These problems include (i) estimating the levels of the thousands of proteins in a tissue sample, (ii) predicting the intensity of the peaks in a mass spectrum, and (iii) explaining why different peptides from the same protein have different peak intensities. In other works, we have focussed on the first two problems. In this paper, we focus on the last problem. In particular, we try to explain why some peptides produce peaks of great intensity, while others produce peaks of low intensity, and we treat this as a classification problem. That is, we experimentally evaluate and compare a variety of discrimination methods for classifying peptides into those that produce high-intensity peaks and those that produce low-intensity peaks. The methods considered include K-Nearest Neighbours
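As a rough illustration of the classification setup this abstract describes, the sketch below labels peptides as low-intensity (0) or high-intensity (1) with a plain k-nearest-neighbour majority vote. It is not the authors' implementation: the function name `knn_predict`, the two-column feature vectors, and the toy data are all hypothetical, chosen only to show the technique.

```python
import numpy as np

def knn_predict(train_X, train_y, query_X, k=3):
    """Majority-vote k-NN labels: 0 = low-intensity, 1 = high-intensity.

    Feature columns are hypothetical peptide descriptors (assumption).
    Use an odd k so that a two-class vote cannot tie.
    """
    train_X = np.asarray(train_X, dtype=float)
    train_y = np.asarray(train_y, dtype=int)
    query_X = np.asarray(query_X, dtype=float)
    # Squared Euclidean distance from every query to every training point.
    d2 = ((query_X[:, None, :] - train_X[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k closest training points for each query.
    nearest = np.argsort(d2, axis=1)[:, :k]
    # Fraction of neighbours labelled 1; above one half means class 1.
    votes = train_y[nearest].mean(axis=1)
    return (votes > 0.5).astype(int)

if __name__ == "__main__":
    # Toy, made-up training set with two well-separated groups.
    X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
    y = [0, 0, 0, 1, 1, 1]
    print(knn_predict(X, y, [[0.2, 0.1], [5.5, 5.2]], k=3))
```

In practice the features would need scaling to comparable ranges before distances are computed, since k-NN is sensitive to the units of each column.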
Canonical Correlation, an Approximation, and the Prediction of Protein Abundance
Cited by 1 (1 self)
Abstract
This paper addresses a central problem of Bioinformatics and Proteomics: estimating the amounts of each of the thousands of proteins in a cell culture or tissue sample. Although laboratory methods involving isotopes have been developed for this problem, we seek a simpler method, one that uses fewer laboratory procedures. Specifically, our aim is to use data-mining methods to infer protein levels from the relatively cheap and abundant data available from high-throughput tandem mass spectrometry (MS/MS). In this paper, we develop and evaluate a method for tackling this problem. The method is based on a simple generative model of MS/MS data. We first show how to linearize the model and fit it to data using Canonical Correlation Analysis (CCA). Then, because CCA is computationally expensive for the large datasets we are dealing with, we develop an efficient approximation of CCA, one that exploits the structure of our data. We prove that the method is correct in that it achieves a well-defined optimization criterion. We also evaluate the method on several biological datasets. The datasets themselves were generated by MS/MS experiments performed on various tissue samples taken from Mouse.
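For readers unfamiliar with the Canonical Correlation Analysis step mentioned in this abstract, here is a minimal sketch of textbook CCA that finds the first pair of canonical weight vectors by whitening and a singular value decomposition. It is neither the paper's linearized generative model nor its efficient approximation; the function name `first_canonical_pair`, the small ridge term `reg`, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def first_canonical_pair(X, Y, reg=1e-8):
    """Return (rho, a, b) for textbook CCA on data matrices X (n x p), Y (n x q).

    rho is the largest canonical correlation; a and b are the weight
    vectors, so that X @ a and Y @ b are maximally correlated.
    `reg` is a small ridge term (an assumption) for numerical stability.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)
    # Cholesky factors: Sxx = Lx Lx^T and Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # Whitened cross-covariance M = Lx^{-1} Sxy Ly^{-T}.
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Map the top singular vectors back to the original coordinates.
    a = np.linalg.solve(Lx.T, U[:, 0])
    b = np.linalg.solve(Ly.T, Vt[0])
    return s[0], a, b

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=(200, 1))  # shared latent signal (synthetic)
    X = np.hstack([z + 0.1 * rng.normal(size=(200, 1)), rng.normal(size=(200, 2))])
    Y = np.hstack([rng.normal(size=(200, 1)), z + 0.1 * rng.normal(size=(200, 1))])
    rho, a, b = first_canonical_pair(X, Y)
    print("first canonical correlation:", rho)
```

The dense covariance matrices and the SVD are where the cost of this direct approach comes from as the number of columns grows, which is consistent with the abstract's motivation for an approximation that exploits the structure of the data.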
Towards the Prediction of Protein Abundance from Tandem Mass Spectrometry Data
Abstract
This paper addresses a central problem of Proteomics: estimating the amounts of each of the thousands of proteins in a cell culture or tissue sample. Although laboratory methods involving isotopes have been developed for this problem, we seek a method that uses simpler laboratory procedures. Specifically, our aim is to use data-mining techniques to infer protein levels from the relatively cheap and abundant data available from high-throughput tandem mass spectrometry (MS/MS). We have developed and evaluated several techniques for tackling this problem, including the development of three generative models of MS/MS data, and methods for efficiently fitting the models to data. In addition, we tested each method on three real-world datasets generated by MS/MS experiments performed on various tissue samples

