Results 11-20 of 205
An ensemble framework for clustering protein-protein interaction networks
 In Proc. 15th Annual Int’l Conference on Intelligent Systems for Molecular Biology (ISMB)
, 2007
Cited by 32 (4 self)
Protein-Protein Interaction (PPI) networks are believed to be important sources of information related to biological processes and complex metabolic functions of the cell. The presence of biologically relevant functional modules in these networks has been theorized by many researchers. However, the application of traditional clustering algorithms for extracting these modules has not been successful, largely due to the presence of noisy false positive interactions as well as specific topological challenges in the network. In this paper, we propose an ensemble clustering framework to address this problem. For base clustering, we introduce two topology-based distance metrics to counteract the effects of noise. We develop a PCA-based consensus clustering technique, designed to reduce the dimensionality of the consensus problem and yield informative clusters. We also develop a soft consensus clustering variant to assign multifaceted proteins to multiple functional groups. We conduct an empirical evaluation of different consensus techniques using topology-based, information theoretic and domain-specific validation metrics and show that our approaches can provide significant benefits over other state-of-the-art approaches. Our analysis of the consensus clusters obtained demonstrates that ensemble clustering can a) produce improved biologically significant functional groupings; and b) facilitate soft clustering by discovering multiple functional associations for proteins.
Conserved network motifs allow protein–protein interaction prediction
, 2004
Cited by 31 (2 self)
Motivation: High-throughput protein interaction detection methods are strongly affected by false positive and false negative results. Focused experiments are needed to complement the large-scale methods by validating previously detected interactions but it is often difficult to decide which proteins to probe as interaction partners. Developing reliable computational methods assisting this decision process is a pressing need in bioinformatics. Results: We show that we can use the conserved properties of the protein network to identify and validate interaction candidates. We apply a number of machine learning algorithms to the protein connectivity information and achieve a surprisingly good overall performance in predicting interacting proteins. Using a ‘leave-one-out’ approach we find average success rates between 20 and 40% for predicting the correct interaction partner of a protein. We demonstrate that the success of these methods is based on the presence of conserved interaction motifs within the network. Availability: A reference implementation and a table with candidate interacting partners for each yeast protein are available
Learning to predict protein–protein interactions from protein sequences
 Bioinformatics
, 2003
Cited by 29 (3 self)
In order to understand the molecular machinery of the cell, we need to know about the multitude of protein–protein interactions that allow the cell to function. High-throughput technologies provide some data about these interactions, but so far that data is fairly noisy. Therefore, computational techniques for predicting protein–protein interactions could be of significant value. One approach to predicting interactions in silico is to produce from first principles a detailed model of a candidate interaction. We take an alternative approach, employing a relatively simple model that learns dynamically from a large collection of data. In this work, we describe an attraction–repulsion model, in which the interaction between a pair of proteins is represented as the sum of attractive and repulsive forces associated with small, domain- or motif-sized features along the length of each protein. The model is discriminative, learning simultaneously from known interactions and from pairs of proteins that are known (or suspected) not to interact. The model is efficient to compute and scales well to very large collections of data. In a cross-validated comparison using known yeast interactions, the attraction–repulsion method performs better than several competing techniques. Contact:
Some protein interaction data do not exhibit power law statistics
 FEBS Lett
, 2005
Cited by 29 (2 self)
It has been claimed that protein-protein interaction (PPI) networks are scale-free based on the observation that the node degree sequence follows a power law. Here we argue that these claims are likely to be based on erroneous statistical analysis. Typically, the supporting data are presented using frequency-degree plots. We show that such plots can be misleading, and should correctly be replaced by rank-degree plots. We provide two PPI network examples in which the frequency-degree plots appear linear on a log-log scale, but the rank-degree plots demonstrate that the node degree sequence is far from a power law. We conclude that at least these PPI networks are not scale-free. Keywords: Protein-protein interaction (PPI) networks, node degree sequence, power law, rank-degree plot. List of abbreviations: PPI: Protein-protein interaction, SF: Scale-free, SR: Scale-rich
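The two plot types contrasted in this abstract differ only in how the degree sequence is tabulated. The following is a minimal sketch of that tabulation, not code from the paper: the function names and the small degree sequence are hypothetical and chosen purely for illustration.

```python
from collections import Counter

def frequency_degree(degrees):
    """Return sorted (degree, count) pairs: the data behind a frequency-degree plot."""
    return sorted(Counter(degrees).items())

def rank_degree(degrees):
    """Return (rank, degree) pairs, rank 1 being the largest degree:
    the data behind a rank-degree plot."""
    ordered = sorted(degrees, reverse=True)
    return [(rank, d) for rank, d in enumerate(ordered, start=1)]

# Hypothetical degree sequence, for illustration only (not data from the paper).
degrees = [1, 1, 1, 1, 2, 2, 3, 5]
freq = frequency_degree(degrees)
ranks = rank_degree(degrees)
```

Plotting either list on log-log axes gives the corresponding plot. The rank-degree form keeps one point per node, so it avoids the binning of nodes by degree that the frequency-degree form relies on.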
TopNet: a tool for comparing biological subnetworks, correlating protein properties with topological statistics
, 2004
Biological network comparison using graphlet degree distribution
 Bioinformatics
Cited by 26 (1 self)
Motivation: Analogous to biological sequence comparison, comparing cellular networks is an important problem that could provide insight into biological understanding and therapeutics. For technical reasons, comparing large networks is computationally infeasible, and thus heuristics, such as the degree distribution, clustering coefficient, diameter, and relative graphlet frequency distribution have been sought. It is easy to demonstrate that two networks are different by simply showing a short list of properties in which they differ. It is much harder to show that two networks are similar, as it requires demonstrating their similarity in all of their exponentially many properties. Clearly, it is computationally prohibitive to analyze all network properties, but the larger the number of constraints we impose in determining network similarity, the more likely it
Approximating Betweenness Centrality
, 2007
Cited by 25 (5 self)
Betweenness is a centrality measure based on shortest paths, widely used in complex network analysis. It is computationally expensive to exactly determine betweenness; currently the fastest-known algorithm by Brandes requires O(nm) time for unweighted graphs and O(nm + n^2 log n) time for weighted graphs, where n is the number of vertices and m is the number of edges in the network. These are also the worst-case time bounds for computing the betweenness score of a single vertex. In this paper, we present a novel approximation algorithm for computing betweenness centrality of a given vertex, for both weighted and unweighted graphs. Our approximation algorithm is based on an adaptive sampling technique that significantly reduces the number of single-source shortest path computations for vertices with high centrality. We conduct an extensive experimental study on real-world graph instances, and observe that our random sampling algorithm gives very good betweenness approximations for biological networks, road networks and web crawls.
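The sampling idea in this abstract can be sketched for unweighted graphs as follows. This is an illustrative sketch, not the authors' implementation: it assumes uniform random source sampling, a stopping rule that halts once the accumulated dependency exceeds c·n, and a hypothetical cap of n samples. The names `dependencies_from` and `approx_betweenness`, the default c = 2, and the cap are all assumptions.

```python
import random
from collections import deque

def dependencies_from(graph, s):
    """Brandes-style dependency of source s on every vertex (unweighted graph)."""
    sigma = {v: 0 for v in graph}    # number of shortest paths from s
    dist = {v: -1 for v in graph}    # BFS distance from s (-1 = unvisited)
    preds = {v: [] for v in graph}   # predecessors on shortest paths
    sigma[s], dist[s] = 1, 0
    order = []
    queue = deque([s])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in graph[v]:
            if dist[w] < 0:
                dist[w] = dist[v] + 1
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
                preds[w].append(v)
    delta = {v: 0.0 for v in graph}
    for w in reversed(order):        # accumulate from the farthest vertices back
        for v in preds[w]:
            delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
    delta[s] = 0.0                   # a source has no dependency on itself
    return delta

def approx_betweenness(graph, v, c=2.0, max_samples=None, seed=0):
    """Estimate the betweenness of v (counting ordered source-target pairs)
    by sampling sources until the running dependency sum reaches c * n.
    The threshold c and the sample cap are assumptions of this sketch."""
    rng = random.Random(seed)
    nodes = list(graph)
    n = len(nodes)
    max_samples = max_samples or n
    total, k = 0.0, 0
    while total < c * n and k < max_samples:
        s = rng.choice(nodes)
        total += dependencies_from(graph, s)[v]
        k += 1
    return n * total / k

# Hypothetical three-vertex path a - b - c, for illustration only.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
estimate_b = approx_betweenness(path, "b")
estimate_a = approx_betweenness(path, "a")
```

Vertices with high centrality accumulate dependency quickly and so hit the stopping threshold after few shortest-path computations, which is the saving the abstract describes. Low-centrality vertices fall back to the sample cap.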
A faster parallel algorithm and efficient multithreaded implementations for evaluating betweenness centrality on massive datasets
, 2009
Cited by 24 (7 self)
We present a new lock-free parallel algorithm for computing betweenness centrality of massive complex networks that achieves better spatial locality compared with previous approaches. Betweenness centrality is a key kernel in analyzing the importance of vertices (or edges) in applications ranging from social networks, to power grids, to the influence of jazz musicians, and is also incorporated into the DARPA HPCS SSCA#2, a benchmark extensively used to evaluate the performance of emerging high-performance computing architectures for graph analytics. We design an optimized implementation of betweenness centrality for the massively multithreaded Cray XMT system with the Threadstorm processor. For a small-world network of 268 million vertices and 2.147 billion edges, the 16-processor XMT system achieves a TEPS rate (an algorithmic performance count for the number of edges traversed per second) of 160 million per second, which corresponds to more than a 2× performance improvement over the previous parallel implementation. We demonstrate the applicability of our implementation to analyze massive real-world datasets by computing approximate betweenness centrality for the large IMDb movie-actor network.
Graph theory and networks in biology
 IET Systems Biology, 1:89–119
, 2007
Cited by 20 (0 self)
In this paper, we present a survey of the use of graph theoretical techniques in Biology. In particular, we discuss recent work on identifying and modelling the structure of biomolecular networks, as well as the application of centrality measures to interaction networks and research on the hierarchical structure of such networks and network motifs. Work on the link between structural network properties and dynamics is also described, with emphasis on synchronization and disease propagation.
Simulation of robustness against lesions of cortical networks
 European Journal of Neuroscience, pp. 1–8
, 2007