Sharp tractability borderlines for finding connected motifs in vertex-colored graphs
In Proc. 34th Int. Colloquium on Automata, Languages and Programming (ICALP), 2007
Cited by 20 (10 self)
Abstract. We study the problem of finding occurrences of motifs in vertex-colored graphs, where a motif is a multiset of colors, and an occurrence of a motif is a subset of connected vertices with a bijection between its colors and the colors of the motif. This problem has applications in metabolic network analysis, an important area in bioinformatics. We give two positive results and three negative results that together draw sharp borderlines between tractable and intractable instances of the problem.
Topology-free querying of protein interaction networks
In Proceedings of 13th RECOMB, 2009
Cited by 20 (3 self)
Abstract. In the network querying problem, one is given a protein complex or pathway of species A and a protein–protein interaction network of species B; the goal is to identify subnetworks of B that are similar to the query. Existing approaches mostly depend on knowledge of the interaction topology of the query in the network of species A; however, in practice, this topology is often not known. To combat this problem, we develop a topology-free querying algorithm, which we call Torque. Given a query, represented as a set of proteins, Torque seeks a matching set of proteins that are sequence-similar to the query proteins and span a connected region of the network, while allowing both insertions and deletions. The algorithm uses alternatively dynamic programming and integer linear programming for the search task. We test Torque with queries from yeast, fly, and human, where we compare it to the QNet topology-based approach, and with queries from less studied species, where only topology-free algorithms apply. Torque detects many more matches than QNet, while in both cases giving results that are highly functionally coherent.
Parameterized Algorithms and Hardness Results for Some Graph Motif Problems
Cited by 16 (2 self)
Abstract. We study the NP-complete Graph Motif problem: given a vertex-colored graph G = (V, E) and a multiset M of colors, does there exist an S ⊆ V such that G[S] is connected and carries exactly (also with respect to multiplicity) the colors in M? We present an improved randomized algorithm for Graph Motif with running time O(4.32^|M| · |M|^2 · |E|). We extend our algorithm to list-colored graph vertices and the case where the motif G[S] need not be connected. By way of contrast, we show that extending the request for motif connectedness to the somewhat “more robust” motif demands of biconnectedness or bridge-connectedness leads to W[1]-complete problems. Actually, we show that the presumably simpler problems of finding (uncolored) biconnected or bridge-connected subgraphs are W[1]-complete with respect to the subgraph size. Answering an open question from the literature, we further show that the parameter “number of connected motif components” leads to W[1]-hardness even when restricted to graphs that are paths.
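The Graph Motif definition in the abstract above can be made concrete with a small brute-force check. This is only an illustrative sketch, not the randomized algorithm the abstract refers to: it tries every vertex subset of size |M|, so it is exponential in general. The names `is_connected`, `find_motif`, and the toy graph are assumptions introduced for illustration.

```python
from collections import Counter
from itertools import combinations


def is_connected(vertices, adj):
    """Return True if the subgraph induced by `vertices` is connected."""
    vertices = set(vertices)
    if not vertices:
        return False
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj.get(u, ()):
            if w in vertices and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices


def find_motif(adj, color, motif):
    """Brute force: return a vertex set S whose colors equal the multiset
    `motif` and whose induced subgraph is connected, or None."""
    target = Counter(motif)
    k = sum(target.values())
    for subset in combinations(sorted(adj), k):
        same_colors = Counter(color[v] for v in subset) == target
        if same_colors and is_connected(subset, adj):
            return set(subset)
    return None


# Hypothetical toy instance: the path a - b - c - d.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
color = {"a": "red", "b": "blue", "c": "red", "d": "green"}
```

On this toy path, the motif {red, red, blue} occurs on the vertices a, b, c, while {red, red, green} has no connected occurrence because a and c are only linked through the blue vertex b.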
An integrative approach for causal gene identification and gene regulatory pathway inference
Bioinformatics, 2006
doi:10.1093/bioinformatics/btl234
Balanced families of perfect hash functions and their applications
Proc. ICALP, 2007
Cited by 10 (3 self)
Abstract. The construction of perfect hash functions is a well-studied topic. In this paper, this concept is generalized with the following definition. We say that a family of functions from [n] to [k] is a δ-balanced (n, k)-family of perfect hash functions if for every S ⊆ [n], |S| = k, the number of functions that are 1-1 on S is between T/δ and δT for some constant T > 0. The standard definition of a family of perfect hash functions requires that there will be at least one function that is 1-1 on S, for each S of size k. In the new notion of balanced families, we require the number of 1-1 functions to be almost the same (taking δ to be close to 1) for every such S. Our main result is that for any constant δ > 1, a δ-balanced (n, k)-family of perfect hash functions of size 2^O(k log log k) · log n can be constructed in time 2^O(k log log k) · n log n. Using the technique of color-coding we can apply our explicit constructions to devise approximation algorithms for various counting problems in graphs. In particular, we exhibit a deterministic polynomial time algorithm for approximating both the number of simple paths of length k and the number of simple cycles of size k for any k ≤ O(log n / log log log n) in a graph with n vertices. The approximation is up to any fixed desirable relative error.
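The balance condition defined in the abstract above can be checked directly on small families. The sketch below is not the paper's construction; it only verifies the definition by enumeration. It assumes functions are represented as tuples `f` with `f[i]` in `range(k)` for `i` in `range(n)`, and that n ≥ k; the helper names are hypothetical. A constant T > 0 with T/δ ≤ count ≤ δT for every S exists exactly when the smallest count is positive and the largest count is at most δ² times the smallest.

```python
from itertools import combinations, product


def one_to_one_counts(family, n, k):
    """For every k-subset S of range(n), count the functions in `family`
    that are 1-1 (injective) on S."""
    counts = {}
    for S in combinations(range(n), k):
        counts[S] = sum(1 for f in family if len({f[i] for i in S}) == k)
    return counts


def is_balanced(family, n, k, delta):
    """Check the delta-balance condition by enumeration (assumes n >= k)."""
    counts = one_to_one_counts(family, n, k).values()
    lo, hi = min(counts), max(counts)
    # Some T > 0 with T/delta <= c <= delta*T for all counts c exists
    # if and only if lo > 0 and hi <= delta^2 * lo.
    return lo > 0 and hi <= delta * delta * lo


# The family of all functions from a 3-element set to 2 colors.
all_functions = list(product(range(2), repeat=3))
# A hypothetical 3-function family that happens to be perfectly balanced.
small_family = [(0, 1, 0), (0, 0, 1), (0, 1, 1)]
```

For n = 3 and k = 2, the family of all eight functions is 1-1 on every pair exactly four times, and `small_family` is 1-1 on every pair exactly twice, so both satisfy the condition even with δ = 1. A single function such as (0, 1, 0) fails, because it is not 1-1 on the pair {0, 2}.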
Weak pattern matching in colored graphs: Minimizing the number of connected components
Cited by 9 (3 self)
In the context of metabolic network analysis, Lacroix et al. [11] introduced the problem of finding occurrences of motifs in vertex-colored graphs, where a motif is a multiset of colors and an occurrence of a motif is a subset of connected vertices which are colored by all colors of the motif. We consider in this paper the above-mentioned problem in one of its natural optimization forms, referred to hereafter as the MinCC problem: Find an occurrence of a motif in a vertex-colored graph, called the target graph, that induces a minimum number of connected components. Our results can be summarized as follows. We prove the MinCC problem to be APX-hard even in the extremal case where the motif is a set and the target graph is a path. We complement this result by giving a polynomial-time algorithm in case the motif is built upon a fixed number of colors and the target graph is a path. Also, extending recent research [8], we prove the MinCC problem to be fixed-parameter tractable when parameterized by the size of the motif, and we give a faster algorithm in case the target graph is a tree. Furthermore, we prove the MinCC problem for trees not to be approximable within ratio c log n for some constant c > 0, where n is the order of the target graph, and to be W[2]-hard when parameterized by the number of connected components in the occurrence of the motif. Finally, we give an exact efficient exponential-time algorithm for the MinCC problem in case the target graph is a tree.
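To illustrate the MinCC objective on the simplest target graph, here is a brute-force sketch for a path. It is not one of the algorithms from the paper. It assumes that an occurrence is a set of positions whose colors match the motif exactly (with multiplicity), that the motif is non-empty, and that the path is given as a list of colors in order; the function name `min_cc_on_path` is hypothetical.

```python
from collections import Counter
from itertools import combinations


def min_cc_on_path(colors, motif):
    """Return the minimum number of connected components over all
    occurrences of `motif` on a path, or None if there is no occurrence.
    `colors[i]` is the color of the i-th vertex of the path."""
    target = Counter(motif)
    k = sum(target.values())
    best = None
    for positions in combinations(range(len(colors)), k):
        if Counter(colors[i] for i in positions) != target:
            continue
        # On a path, components are maximal runs of consecutive positions.
        gaps = sum(1 for a, b in zip(positions, positions[1:]) if b != a + 1)
        components = 1 + gaps
        if best is None or components < best:
            best = components
    return best
```

For the path colored a, b, a, c, the motif {a, c} can be placed on the last two vertices as a single component, whereas the motif {a, a, c} must use both a-colored vertices and therefore splits into two components.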
NeAT: a toolbox for the analysis of biological networks, clusters, classes and pathways
Nucleic Acids Res, 2008
Algorithm engineering for color-coding to facilitate signaling pathway detection
In Proc. 5th Asia-Pacific Bioinformatics Conference (APBC ’07), Advances in Bioinformatics and Computational Biology. World Scientific, 2007
Cited by 8 (1 self)
To identify linear signaling pathways, Scott et al. [RECOMB, 2005] recently proposed to extract paths with high interaction probabilities from protein interaction networks. They used an algorithmic technique known as color-coding to solve this NP-hard problem; their implementation is capable of finding biologically meaningful pathways of length up to 10 proteins within hours. In this work, we give various novel algorithmic improvements for color-coding, both from a worst-case perspective as well as under practical considerations. Experiments on the interaction networks of yeast and fruit fly as well as a testbed of structurally comparable random networks demonstrate a speedup of the algorithm by orders of magnitude. This allows more complex and larger structures to be identified in reasonable time; finding paths of length up to 13 proteins can even be done in seconds and thus allows for an interactive exploration and evaluation of pathway candidates.
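The basic color-coding idea mentioned in the abstract above can be sketched as follows. This is a textbook-style version for detecting a simple path on k vertices, not the engineered implementation the abstract describes, and it ignores interaction probabilities. The function names, the trial count, and the seed are assumptions chosen for illustration.

```python
import random


def has_colorful_path(adj, coloring, k):
    """Return True if some path on k vertices uses k distinct colors.
    A path with pairwise distinct colors is automatically simple."""
    # masks[v] holds the color bitmasks of colorful paths ending at v.
    masks = {v: {1 << coloring[v]} for v in adj}
    for _ in range(k - 1):
        extended = {v: set() for v in adj}
        for u in adj:
            for m in masks[u]:
                for w in adj[u]:
                    bit = 1 << coloring[w]
                    if not m & bit:  # color of w not used yet
                        extended[w].add(m | bit)
        masks = extended
    return any(masks[v] for v in adj)


def color_coding_path(adj, k, trials=200, seed=0):
    """Randomized test for a simple path on k vertices.
    A True answer is always correct; a False answer may be wrong
    with a probability that shrinks as `trials` grows."""
    rng = random.Random(seed)
    for _ in range(trials):
        coloring = {v: rng.randrange(k) for v in adj}
        if has_colorful_path(adj, coloring, k):
            return True
    return False


# Hypothetical toy graphs: a path on 4 vertices and a star with 3 leaves.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
```

The dynamic program stores, for each vertex, only the sets of colors used by paths ending there, which is what keeps the search exponential in k rather than in the size of the graph. On the star, no simple path has four vertices, so `color_coding_path(star, 4)` returns False for any coloring.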
Counting Stars and Other Small Subgraphs in Sublinear Time
Cited by 8 (2 self)
Detecting and counting the number of copies of certain subgraphs (also known as network motifs or graphlets), is motivated by applications in a variety of areas ranging from Biology to the study of the World Wide Web. Several polynomial-time algorithms have been suggested for counting or detecting the number of occurrences of certain network motifs. However, a need for more efficient algorithms arises when the input graph is very large, as is indeed the case in many applications of motif counting. In this paper we design sublinear-time algorithms for approximating the number of copies of certain constant-size subgraphs in a graph G. That is, our algorithms do not read the whole graph, but rather query parts of the graph. Specifically, we consider algorithms that may query the degree of any vertex of their choice and may ask for any neighbor of any vertex of their choice. The main focus of this work is on the basic problem of counting the number of length-2 paths and more generally on counting the number of stars of a certain size. Specifically, we design an algorithm that, given an approximation parameter 0 < ε < 1 and query access to a graph G, outputs an estimate ν̂_s such that with high constant probability, (1 − ε)·ν_s(G) ≤ ν̂_s ≤ (1 + ε)·ν_s(G), where ν_s(G) denotes the number of stars of size s + 1 in the graph. The expected query complexity and running time of the algorithm are O
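The quantity ν_s(G) from the abstract above has a simple closed form for s ≥ 2: every star with s leaves has a unique center v, so ν_s(G) is the sum over all vertices of C(deg(v), s). The sketch below computes this exactly and also shows a naive estimator that samples vertices uniformly and uses only degree queries. The naive estimator is an assumption for illustration and is not the paper's algorithm; it carries no (1 ± ε) guarantee on graphs with skewed degrees. Both function names are hypothetical.

```python
import random
from math import comb


def count_stars_exact(adj, s):
    """Exact number of stars with s leaves (s >= 2): choose the center v,
    then choose s of its neighbors."""
    return sum(comb(len(neighbors), s) for neighbors in adj.values())


def estimate_stars(adj, s, samples, seed=0):
    """Naive estimate from uniformly sampled vertices and their degrees.
    Unbiased, but its variance can be large on skewed degree sequences."""
    rng = random.Random(seed)
    vertices = list(adj)
    total = 0
    for _ in range(samples):
        v = rng.choice(vertices)
        total += comb(len(adj[v]), s)
    return len(vertices) * total / samples


# Hypothetical toy graphs: a star with 3 leaves and a cycle on 5 vertices.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
cycle = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
```

On a regular graph such as the cycle, every sampled vertex contributes the same value, so the naive estimate equals the exact count. On the star, almost all of the count sits at one vertex, which is the kind of instance where uniform sampling alone performs poorly.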