Topology-free querying of protein interaction networks
In Proceedings of 13th RECOMB, 2009
Cited by 20 (3 self)
Abstract. In the network querying problem, one is given a protein complex or pathway of species A and a protein–protein interaction network of species B; the goal is to identify subnetworks of B that are similar to the query. Existing approaches mostly depend on knowledge of the interaction topology of the query in the network of species A; however, in practice, this topology is often not known. To combat this problem, we develop a topology-free querying algorithm, which we call Torque. Given a query, represented as a set of proteins, Torque seeks a matching set of proteins that are sequence-similar to the query proteins and span a connected region of the network, while allowing both insertions and deletions. The algorithm uses either dynamic programming or integer linear programming for the search task. We test Torque with queries from yeast, fly, and human, where we compare it to the QNet topology-based approach, and with queries from less-studied species, where only topology-free algorithms apply. Torque detects many more matches than QNet, while in both cases giving results that are highly functionally coherent.
Parameterized Algorithms and Hardness Results for Some Graph Motif Problems
Cited by 16 (2 self)
Abstract. We study the NP-complete Graph Motif problem: given a vertex-colored graph G = (V, E) and a multiset M of colors, does there exist an S ⊆ V such that G[S] is connected and carries exactly (also with respect to multiplicity) the colors in M? We present an improved randomized algorithm for Graph Motif with running time O(4.32^{|M|} · |M|^2 · |E|). We extend our algorithm to list-colored graph vertices and to the case where the motif G[S] need not be connected. By way of contrast, we show that extending the request for motif connectedness to the somewhat “more robust” motif demands of biconnectedness or bridge-connectedness leads to W[1]-complete problems. Actually, we show that the presumably simpler problems of finding (uncolored) biconnected or bridge-connected subgraphs are W[1]-complete with respect to the subgraph size. Answering an open question from the literature, we further show that the parameter “number of connected motif components” leads to W[1]-hardness even when restricted to graphs that are paths.
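The Graph Motif definition above can be made concrete with a small brute-force check. This is only an illustrative sketch, not the randomized algorithm from the paper: it enumerates every vertex subset of size |M|, so it is exponential and suitable for toy inputs only. The names `is_connected`, `graph_motif_brute_force`, and the adjacency-dictionary representation are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def is_connected(vertices, adj):
    """Return True if the subgraph induced by `vertices` is connected."""
    vertices = set(vertices)
    if not vertices:
        return False
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj.get(u, ()):
            if w in vertices and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices


def graph_motif_brute_force(adj, color, motif):
    """Search for S with G[S] connected and colors of S equal to the multiset `motif`.

    adj   : dict mapping each vertex to a set of neighbours (undirected graph)
    color : dict mapping each vertex to its single color
    motif : iterable of colors, repetitions allowed (the multiset M)
    Returns a matching vertex set, or None if no such set exists.
    """
    target = Counter(motif)
    k = sum(target.values())
    for subset in combinations(sorted(adj), k):
        same_colors = Counter(color[v] for v in subset) == target
        if same_colors and is_connected(subset, adj):
            return set(subset)
    return None


# Toy instance (hypothetical data): the path a - b - c - d.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
color = {"a": "red", "b": "blue", "c": "red", "d": "green"}
print(graph_motif_brute_force(adj, color, ["red", "blue", "red"]))
print(graph_motif_brute_force(adj, color, ["red", "red", "green"]))
```

On this toy path, the motif {red, blue, red} is matched by the connected set {a, b, c}, while {red, red, green} has no connected occurrence because a is not adjacent to c or d.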
SAPPER: Subgraph Indexing and Approximate Matching in Large Graphs
Cited by 9 (0 self)
With the emergence of new applications, e.g., computational biology, new software engineering techniques, and social networks, more data is in the form of graphs. Locating occurrences of a query graph in a large database graph is an important research topic. Due to the existence of noise (e.g., missing edges) in the large database graph, we investigate the problem of approximate subgraph indexing, i.e., finding the occurrences of a query graph in a large database graph with (possible) missing edges. The SAPPER method is proposed to solve this problem. Utilizing the hybrid neighborhood unit structures in the index, SAPPER takes advantage of pre-generated random spanning trees and a carefully designed graph enumeration order. Real and synthetic data sets are employed to demonstrate the efficiency and scalability of our approximate subgraph indexing method.
Counting Stars and Other Small Subgraphs in Sublinear Time
Cited by 8 (2 self)
Detecting and counting the number of copies of certain subgraphs (also known as network motifs or graphlets) is motivated by applications in a variety of areas ranging from biology to the study of the World Wide Web. Several polynomial-time algorithms have been suggested for counting or detecting the number of occurrences of certain network motifs. However, a need for more efficient algorithms arises when the input graph is very large, as is indeed the case in many applications of motif counting. In this paper we design sublinear-time algorithms for approximating the number of copies of certain constant-size subgraphs in a graph G. That is, our algorithms do not read the whole graph, but rather query parts of the graph. Specifically, we consider algorithms that may query the degree of any vertex of their choice and may ask for any neighbor of any vertex of their choice. The main focus of this work is on the basic problem of counting the number of length-2 paths and more generally on counting the number of stars of a certain size. Specifically, we design an algorithm that, given an approximation parameter 0 < ε < 1 and query access to a graph G, outputs an estimate ν̂_s such that with high constant probability, (1 − ε) ν_s(G) ≤ ν̂_s ≤ (1 + ε) ν_s(G), where ν_s(G) denotes the number of stars of size s + 1 in the graph. The expected query complexity and running time of the algorithm are O ...
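For orientation, the quantity ν_s(G) that the abstract approximates can be computed exactly by reading every degree, since a star with s leaves is a center vertex plus a choice of s of its neighbours. The sketch below is this linear-time exact baseline, not the sublinear algorithm of the paper; the function names and the adjacency-dictionary input are assumptions for illustration. The second helper simply checks the (1 ± ε) guarantee stated above for a given estimate.

```python
from math import comb


def count_stars(adj, s):
    """Exact number of stars with s leaves (s + 1 vertices) in an undirected graph.

    adj : dict mapping each vertex to a set of neighbours
    s   : number of leaves, s >= 2 (for s = 1 every edge would be counted twice)
    """
    if s < 2:
        raise ValueError("this sketch assumes s >= 2")
    # Each vertex v is the center of C(deg(v), s) stars.
    return sum(comb(len(neigh), s) for neigh in adj.values())


def within_tolerance(estimate, exact, eps):
    """Check (1 - eps) * exact <= estimate <= (1 + eps) * exact."""
    return (1 - eps) * exact <= estimate <= (1 + eps) * exact


# Toy graphs (hypothetical data).
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
claw = {"c": {"x", "y", "z"}, "x": {"c"}, "y": {"c"}, "z": {"c"}}

print(count_stars(triangle, 2))  # length-2 paths in a triangle
print(count_stars(claw, 2), count_stars(claw, 3))
```

With s = 2 this counts length-2 paths: the triangle has 3 of them, and the claw (one center with three leaves) has 3 length-2 paths and a single star with three leaves.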
Algorithm Engineering for Color-Coding with Applications to Signaling Pathway Detection
2007
Cited by 5 (0 self)
Color-coding is a technique to design fixed-parameter algorithms for several NP-complete subgraph isomorphism problems. Somewhat surprisingly, not much work has so far been spent on the actual implementation of algorithms that are based on color-coding, despite the elegance of this technique and its wide range of applicability to practically important problems. This work gives various novel algorithmic improvements for color-coding, both from a worst-case perspective as well as under practical considerations. We apply the resulting implementation to the identification of signaling pathways in protein interaction networks, demonstrating that our improvements speed up the color-coding algorithm by orders of magnitude over previous implementations. This allows more complex and larger structures to be identified in reasonable time; many biologically relevant instances can even be solved in seconds where, previously, hours were required.
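As background for the abstract above, the textbook form of color-coding for detecting a simple path on k vertices can be sketched as follows. This is the basic scheme only, with none of the engineering improvements the paper describes; the function names, the bitmask representation, and the default number of trials are assumptions for this example. Each trial colors vertices at random with k colors and uses dynamic programming over (end vertex, set of used colors) to look for a path whose vertices all have distinct colors. Such a path is necessarily simple, so a positive answer is always correct, while a negative answer may be wrong with small probability.

```python
import random


def has_colorful_path(adj, coloring, k):
    """Return True if some path on k vertices uses k distinct colors."""
    # reach[v] holds bitmasks of color sets of colorful paths ending at v.
    reach = {v: {1 << coloring[v]} for v in adj}
    for _ in range(k - 1):
        nxt = {v: set() for v in adj}
        for u in adj:
            for mask in reach[u]:
                for v in adj[u]:
                    bit = 1 << coloring[v]
                    if not mask & bit:  # color of v not used yet
                        nxt[v].add(mask | bit)
        reach = nxt
    return any(reach[v] for v in adj)


def has_path_color_coding(adj, k, trials=200, seed=0):
    """Randomized test for a simple path on k vertices (one-sided error)."""
    rng = random.Random(seed)
    for _ in range(trials):
        coloring = {v: rng.randrange(k) for v in adj}
        if has_colorful_path(adj, coloring, k):
            return True
    return False


# Toy graphs (hypothetical data).
path4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
claw = {"c": {"x", "y", "z"}, "x": {"c"}, "y": {"c"}, "z": {"c"}}

print(has_path_color_coding(path4, 4))
print(has_path_color_coding(claw, 4))
```

A fixed path on k vertices becomes colorful in one trial with probability k!/k^k, so the number of trials needed for a given confidence grows roughly like e^k; the claw has no simple path on four vertices, so every trial correctly fails on it.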
GraMoFoNe: a Cytoscape plugin for querying motifs without topology in Protein-Protein Interactions networks
In 2nd International Conference on Bioinformatics and Computational Biology (BICoB’10), 2010
Cited by 4 (2 self)
During the last decade, the amount of data on Protein-Protein Interactions (PPI) has grown enormously. Searching for motifs in PPI networks has thus become a crucial problem for interpreting this data. A large part of the literature is devoted to querying motifs with a given topology. However, the biological data are, by now, so noisy (missing and erroneous information) that the topology of a motif can be irrelevant. Consequently, Lacroix et al. [19] defined a new problem, called GRAPH MOTIF, which consists of searching for a multiset of colors in a vertex-colored graph. In this article, we present GraMoFoNe, a plugin for Cytoscape based on a Linear Pseudo-Boolean optimization solver, which handles GRAPH MOTIF and some of its extensions.
Approximating the number of Network Motifs
Cited by 3 (2 self)
Abstract. The World Wide Web, the Internet, coupled biological and chemical systems, neural networks, and social interacting species are only a few examples of systems composed of a large number of highly interconnected dynamical units. These networks contain characteristic patterns, termed network motifs, which occur far more often than in randomized networks with the same degree sequence. Several algorithms have been suggested for counting or detecting the number of induced or non-induced occurrences of network motifs in the form of trees and bounded-treewidth subgraphs of size O(log n), and of size at most 7 for some motifs. In addition, counting the number of motifs a node is part of was recently suggested as a method to classify nodes in the network. The promise is that the distribution of motifs a node participates in is an indication of its function in the network. Therefore, counting the number of network motifs a node is part of poses a major challenge. However, no such practical algorithm exists. We present several algorithms with time complexity O(e^{2k} · k · n · |E| · log(1/δ) / ε^2) that, for the first time, approximate for every vertex the number of non-induced occurrences of the motif the vertex is part of, for k-length cycles, k-length cycles with a chord, and (k − 1)-length paths, where k = O(log n), and for all motifs of size at most four. In addition, we show algorithms that approximate the total number of non-induced occurrences of these network motifs, when no efficient algorithm exists. Some of our algorithms use the color-coding technique.
Parameterized Algorithmics for Finding Connected Motifs in Biological Networks
IEEE TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
Cited by 1 (0 self)
We study the NP-hard LIST-COLORED GRAPH MOTIF problem which, given an undirected list-colored graph G = (V, E) and a multiset M of colors, asks for maximum-cardinality sets S ⊆ V and M′ ⊆ M such that G[S] is connected and contains exactly (with respect to multiplicity) the colors in M′. LIST-COLORED GRAPH MOTIF has applications in the analysis of biological networks. We study LIST-COLORED GRAPH MOTIF with respect to three different parameterizations. For the parameters motif size |M| and solution size |S| we present fixed-parameter algorithms, whereas for the parameter |V| − |M| we show W[1]-hardness for general instances and achieve fixed-parameter tractability for a special case of LIST-COLORED GRAPH MOTIF. We implemented the fixed-parameter algorithms for parameters |M| and |S|, developed further speed-up heuristics for these algorithms, and applied them in the context of querying protein-interaction networks, demonstrating their usefulness for realistic instances. Furthermore, we show that extending the request for motif connectedness to stronger demands such as biconnectedness or bridge-connectedness leads to W[1]-hard problems when the parameter is the motif size |M|.
Querying Graphs in Protein-Protein Interactions Networks using Feedback Vertex Set
Cited by 1 (0 self)
Recent techniques are rapidly increasing our knowledge of interactions between proteins. The interpretation of this new information depends on our ability to retrieve known substructures in the data, the Protein-Protein Interactions (PPI) networks. From an algorithmic point of view, this is a hard task, since it often leads to NP-hard problems. To overcome this difficulty, many authors have provided tools for querying patterns with a restricted topology, i.e., paths or trees in PPI networks. Such a restriction leads to the development of fixed-parameter tractable (FPT) algorithms, which can be practical for queries of restricted size. Unfortunately, GRAPH HOMOMORPHISM is a W[1]-hard problem, and hence no FPT algorithm is expected to exist when patterns are in the shape of general graphs. However, Dost et al. [2] gave an algorithm (which is not implemented) to query graphs with a bounded treewidth in PPI networks (the treewidth of the query being involved in the time complexity). In this paper, we propose another algorithm for querying patterns in the shape of graphs, also based on dynamic programming and the color-coding technique. To transform graph queries into trees without loss of information, we use a feedback vertex set coupled with a node duplication mechanism. Hence, our algorithm is FPT for querying graphs with a bounded-size feedback vertex set. It gives an alternative to the treewidth parameter, which can be better or worse for a given query. We provide a Python implementation, which allows us to validate our approach on real data. In particular, we retrieve some human queries in the shape of graphs in the fly PPI network.