Global landscape of protein complexes in the yeast Saccharomyces cerevisiae.
Nature, 2006
Cited by 296 (9 self)
Identification of protein-protein interactions often provides insight into protein function, and many cellular processes are performed by stable protein complexes. We used tandem affinity purification to process 4,562 different tagged proteins of the yeast Saccharomyces cerevisiae. Each preparation was analysed by both matrix-assisted laser desorption/ionization-time of flight mass spectrometry and liquid chromatography tandem mass spectrometry to increase coverage and accuracy. Machine learning was used to integrate the mass spectrometry scores and assign probabilities to the protein-protein interactions. Among 4,087 different proteins identified with high confidence by mass spectrometry from 2,357 successful purifications, our core data set (median precision of 0.69) comprises 7,123 protein-protein interactions involving 2,708 proteins. A Markov clustering algorithm organized these interactions into 547 protein complexes averaging 4.9 subunits per complex, about half of them absent from the MIPS database, as well as 429 additional interactions between pairs of complexes. The data (all of which are available online) will help future studies on individual proteins as well as functional genomics and systems biology.
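The abstract above mentions a Markov clustering algorithm without giving details. As a rough illustration only, the following is a minimal sketch of the generic Markov clustering (MCL) loop, alternating expansion (matrix squaring) and inflation (element-wise powering followed by column normalization) on an unweighted graph. The function names, the inflation value of 2.0, the added self-loops, and the toy graph are all assumptions made for this sketch; the paper's own parameters and weighting scheme are not stated here.

```python
def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    out = [row[:] for row in m]
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        if total > 0:
            for i in range(n):
                out[i][j] = m[i][j] / total
    return out


def matmul(a, b):
    """Plain square-matrix product; fine for toy-sized graphs."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(edges, n, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster an undirected graph on nodes 0..n-1 (hypothetical helper)."""
    # Adjacency matrix with self-loops, a common MCL convention (assumed here).
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = 1.0
    for u, v in edges:
        m[u][v] = m[v][u] = 1.0
    m = normalize_columns(m)

    for _ in range(max_iter):
        expanded = matmul(m, m)  # expansion: random-walk flow spreads out
        inflated = normalize_columns(  # inflation: strong flow is reinforced
            [[x ** inflation for x in row] for row in expanded])
        change = max(abs(inflated[i][j] - m[i][j])
                     for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break

    # Rows that still carry flow act as attractors; the columns they reach
    # form one cluster. Identical clusters are merged by the set.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy example: two triangles with no edges between them.
triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
print(markov_cluster(triangles, 6))
```

On real interaction data the edges would typically carry confidence weights, and a sparse-matrix implementation would be needed; this dense pure-Python version is meant only to show the two alternating steps.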
Inferring Networks of Diffusion and Influence
2010
Cited by 116 (13 self)
Information diffusion and virus propagation are fundamental processes taking place in networks. While it is often possible to directly observe when nodes become infected, observing individual transmissions (i.e., who infects whom or who influences whom) is typically very difficult. Furthermore, in many applications, the underlying network over which the diffusions and propagations spread is actually unobserved. We tackle these challenges by developing a method for tracing paths of diffusion and influence through networks and inferring the networks over which contagions propagate. Given the times when nodes adopt pieces of information or become infected, we identify the optimal network that best explains the observed infection times. Since the optimization problem is NP-hard to solve exactly, we develop an efficient approximation algorithm that scales to large datasets and in practice gives provably near-optimal performance. We demonstrate the effectiveness of our approach by tracing information cascades in a set of 170 million blogs and news articles over a one-year period to infer how information flows through the online media space. We find that the diffusion network of news tends to have a core-periphery structure with a small set of core media sites that diffuse information to the rest of the Web. These sites tend to have stable circles of influence with more general news media sites acting as connectors between them.
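To make the input and output of such an inference task concrete, here is a deliberately simplified sketch. It is not the approximation algorithm described in the abstract: it is a plain temporal-precedence heuristic that scores each candidate edge u -> v by how often, and how soon, v becomes infected after u, then keeps the k best-scoring edges. The exponential decay weight, the function names, and the toy cascades are assumptions introduced for illustration.

```python
import math
from collections import defaultdict


def score_edges(cascades):
    """Score directed edges from infection times.

    Each cascade is a dict mapping node -> infection time. An ordered pair
    (u, v) gains weight exp(-(t_v - t_u)) whenever u is infected before v,
    so short delays count more than long ones (an assumed weighting).
    """
    scores = defaultdict(float)
    for times in cascades:
        for u, tu in times.items():
            for v, tv in times.items():
                if tu < tv:
                    scores[(u, v)] += math.exp(-(tv - tu))
    return dict(scores)


def top_k_edges(cascades, k):
    """Return the k highest-scoring edges, ties broken by edge name."""
    scores = score_edges(cascades)
    ranked = sorted(scores, key=lambda e: (-scores[e], e))
    return ranked[:k]


# Toy cascades: node -> time at which the node was observed infected.
cascades = [
    {"a": 0, "b": 1, "c": 2},
    {"a": 0, "b": 1},
    {"b": 0, "c": 1},
]
print(top_k_edges(cascades, 2))
```

In this toy data the edges a -> b and b -> c are each supported by two short-delay observations, while a -> c is supported by one longer-delay observation, so the heuristic prefers the chain. A method with optimality guarantees, as the abstract describes, would instead optimize how well the whole network explains all cascades jointly.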
Whole-proteome prediction of protein function via graph-theoretic analysis of interaction maps
Bioinformatics, Vol. 21 Suppl. 1, 2005, pages i302–i310
A mitochondrial protein compendium elucidates complex I disease biology
Cell, 2008
Cited by 81 (5 self)
Mitochondria are complex organelles whose dysfunction underlies a broad spectrum of human diseases. Identifying all of the proteins resident in this organelle and understanding how they integrate into pathways represent major challenges in cell biology. Toward this goal, we performed mass spectrometry, GFP tagging, and machine learning to create a mitochondrial compendium of 1098 genes and their protein expression across 14 mouse tissues. We link poorly characterized proteins in this inventory to known mitochondrial pathways by virtue of shared evolutionary history. Using this approach, we predict 19 proteins to be important for the function of complex I (CI) of the electron transport chain. We validate a subset of these predictions using RNAi, including C8orf38, which we further show harbors an inherited mutation in a lethal, infantile CI deficiency. Our results have important implications for understanding CI function and pathogenesis and, more generally, illustrate how our compendium can serve as a foundation for systematic investigations of mitochondria.
Random Forest Similarity for Protein-Protein Interaction Prediction
Pac Symp Biocomput, 2005
Cited by 68 (12 self)
One of the most important, but often ignored, parts of any clustering and classification algorithm is the computation of the similarity matrix. This is especially important when integrating high throughput biological data sources because of the high noise rates and the many missing values. In this paper we present a new method to compute such similarities for the task of classifying pairs of proteins as interacting or not. Our method uses direct and indirect information about interaction pairs to construct a random forest (a collection of decision trees) from a training set. The resulting forest is used to determine the similarity between protein pairs and this similarity is used by a classification algorithm (a modified kNN) to classify protein pairs. Testing the algorithm on yeast data indicates that it is able to improve coverage to 20% of interacting pairs with a false positive rate of 50%. These results compare favorably with all previously suggested methods for this task, indicating the importance of robust similarity estimates.
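The abstract does not spell out how the forest yields a similarity, so the sketch below uses the standard random-forest proximity as a stand-in: two samples are similar in proportion to the number of trees in which they fall into the same leaf. It then applies an ordinary majority-vote kNN, not the modified kNN the abstract refers to. The leaf identifiers, labels, and function names are hypothetical, and the forest itself is assumed to have been trained elsewhere.

```python
from collections import Counter


def forest_similarity(leaves_a, leaves_b):
    """Fraction of trees in which two samples land in the same leaf.

    Each argument is a sequence with one leaf id per tree of the forest.
    """
    matches = sum(1 for a, b in zip(leaves_a, leaves_b) if a == b)
    return matches / len(leaves_a)


def knn_classify(query_leaves, training, k=3):
    """Label a query by majority vote among its k most similar examples.

    `training` is a list of (leaf_ids, label) tuples (hypothetical format).
    """
    ranked = sorted(
        training,
        key=lambda item: forest_similarity(query_leaves, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: leaf ids from a hypothetical 4-tree forest for five protein pairs.
training = [
    ((1, 4, 2, 7), "interacting"),
    ((1, 4, 2, 3), "interacting"),
    ((5, 4, 6, 3), "non-interacting"),
    ((5, 0, 6, 9), "non-interacting"),
    ((1, 0, 2, 7), "interacting"),
]
print(knn_classify((1, 4, 2, 9), training, k=3))
```

One reason a forest-based similarity suits noisy data with missing values is that each tree only consults a few features, so a pair can still be close to its neighbours when some measurements are absent.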
Assessing the limits of genomic data integration for predicting protein networks
2005
Predicting protein complex membership using probabilistic network reliability
Genome Res, 2004
Global protein function annotation through mining genome-scale data in yeast Saccharomyces cerevisiae
Nucleic Acids Res
Cited by 60 (0 self)
As we are moving into the post genome-sequencing era, various high-throughput experimental techniques have been developed to characterize biological systems on the genomic scale. Discovering new biological knowledge from the high-throughput biological data is a major challenge to bioinformatics today. To address this challenge, we developed a Bayesian statistical method together with a Boltzmann machine and simulated annealing for protein functional annotation in the yeast Saccharomyces cerevisiae through integrating various high-throughput biological data including yeast two-hybrid data, protein complexes, and microarray gene expression profiles. In our approach, we quantified the relationship between functional similarity and high-throughput data, and coded them into a “functional linkage graph”, where each node represents one protein and the weight of each edge is characterized by the Bayesian probability of function similarity between two proteins. We also integrated the evolution information and protein subcellular localization information into the prediction. Based on our method, 1802 out of 2280 unannotated proteins in yeast were assigned functions systematically.