Results 1-10 of 117
Identifying the minimal transversals of a hypergraph and related problems
SIAM Journal on Computing, 1995
Cited by 125 (7 self)
The paper considers two decision problems on hypergraphs, hypergraph saturation and recognition of the transversal hypergraph, and discusses their significance for several search problems in applied computer science. Hypergraph saturation, i.e., given a hypergraph H, decide if every subset of vertices is contained in or contains some edge of H, is shown to be coNP-complete. A certain subproblem of hypergraph saturation, the saturation of simple hypergraphs, is shown to be computationally equivalent to transversal hypergraph recognition, i.e., given two hypergraphs H_1, H_2, decide if the sets in H_2 are all the minimal transversals of H_1. The complexity of the search problem related to the recognition of the transversal hypergraph, the computation of the transversal hypergraph, is an open problem. This task needs time exponential in the input size, but it is unknown whether an output-polynomial algorithm exists for this problem. For several important subcases, for instance if an upper or lower bound is imposed on the edge size or for acyclic hypergraphs, we present output-polynomial algorithms. Computing or recognizing the minimal transversals of a hypergraph is a frequent problem in practice, which is pointed out by identifying important applications in database theory, Boolean switching theory, logic, and AI, particularly in model-based diagnosis.
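To make the two notions in this abstract concrete, here is a minimal brute-force sketch in Python. It is not the paper's method: the function names (`minimal_transversals`, `is_transversal_hypergraph`) and the set-based hypergraph encoding are assumptions chosen for illustration, and the running time is exponential in the number of vertices.

```python
from itertools import combinations


def is_transversal(candidate, edges):
    """Return True if `candidate` intersects every edge."""
    return all(candidate & edge for edge in edges)


def minimal_transversals(vertices, edges):
    """Enumerate all minimal transversals by brute force, smallest first.

    Illustrative only: this tries every vertex subset, so it is
    exponential in len(vertices).
    """
    edges = [frozenset(e) for e in edges]
    found = []
    for size in range(len(vertices) + 1):
        for combo in combinations(sorted(vertices), size):
            candidate = frozenset(combo)
            # A superset of an already found transversal is not minimal.
            if any(t <= candidate for t in found):
                continue
            if is_transversal(candidate, edges):
                found.append(candidate)
    return found


def is_transversal_hypergraph(vertices, h1, h2):
    """Decide whether the edges of h2 are exactly the minimal transversals of h1."""
    return set(minimal_transversals(vertices, h1)) == {frozenset(e) for e in h2}
```

For the hypergraph with edges {1, 2} and {2, 3} on vertices {1, 2, 3}, the sketch returns the minimal transversals {2} and {1, 3}.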
Discovering all Most Specific Sentences by Randomized Algorithms (Extended Abstract)
In Intl. Conf. on Database Theory, 1997
Cited by 55 (5 self)
Dimitrios Gunopulos (1), Heikki Mannila (2), and Sanjeev Saluja (3). (1) Max-Planck-Institut Informatik, Im Stadtwald, 66123 Saarbrücken, Germany. gunopulo@mpisb.mpg.de (2) University of Helsinki, Dept. of Computer Science, FIN-00014 Helsinki, Finland. Heikki.Mannila@cs.helsinki.fi. Work supported by Alexander von Humboldt-Stiftung and the Academy of Finland. (3) Max-Planck-Institut Informatik, Im Stadtwald, 66123 Saarbrücken, Germany. saluja@mpisb.mpg.de
Abstract. Data mining can in many instances be viewed as the task of computing a representation of a theory of a model or a database. In this paper we present a randomized algorithm that can be used to compute the representation of a theory in terms of the most specific sentences of that theory. In addition to randomization, the algorithm uses a generalization of the concept of hypergraph transversal. We apply the general algorithm for discovering maximal frequent sets in 0/1 data, and for computing minimal keys in relations. We prese...
Finding and approximating top-k answers in keyword proximity search
In PODS, 2006
Cited by 47 (6 self)
Various approaches for keyword proximity search have been implemented in relational databases, XML and the Web. Yet, in all of them, an answer is a Q-fragment, namely, a subtree T of the given data graph G, such that T contains all the keywords of the query Q and has no proper subtree with this property. The rank of an answer is inversely proportional to its weight. Three problems are of interest: finding an optimal (i.e., top-ranked) answer, computing the top-k answers and enumerating all the answers in ranked order. It is shown that, under data complexity, an efficient algorithm for solving the first problem is sufficient for solving the other two problems with polynomial delay. Similarly, an efficient algorithm for finding a θ-approximation of the optimal answer suffices for carrying out the following two tasks with polynomial delay, under query-and-data complexity. First, enumerating in a (θ + 1)-approximate order. Second, computing a (θ + 1)-approximation of the top-k answers. As a corollary, this paper gives the first efficient algorithms, under data complexity, for enumerating all the answers in ranked order and for computing the top-k answers. It also gives the first efficient algorithms, under query-and-data complexity, for enumerating in a provably approximate order and for computing an approximation of the top-k answers.
New Results on Monotone Dualization and Generating Hypergraph Transversals
SIAM JOURNAL ON COMPUTING, 2002
Cited by 37 (12 self)
We consider the problem of dualizing a monotone CNF (equivalently, computing all minimal transversals of a hypergraph), whose associated decision problem is a prominent open problem in NP-completeness. We present a number of new polynomial-time and output-polynomial-time results for significant cases, which largely advance the tractability frontier and improve on previous results. Furthermore, we show that duality of two monotone CNFs can be disproved with limited nondeterminism. More precisely, this is feasible in polynomial time with O(log² n/log log n) suitably guessed bits. This result sheds new light on the complexity of this important problem.
Generating Linear Extensions Fast
Cited by 37 (6 self)
One of the most important sets associated with a poset P is its set of linear extensions, E(P). In this paper, we present an algorithm to generate all of the linear extensions of a poset in constant amortized time; that is, in time O(e(P)), where e(P) = |E(P)|. The fastest previously known algorithm for generating the linear extensions of a poset runs in time O(n e(P)), where n is the number of elements of the poset. Our algorithm is the first constant amortized time algorithm for generating a "naturally defined" class of combinatorial objects for which the corresponding counting problem is #P-complete. Furthermore, we show that linear extensions can be generated in constant amortized time where each extension differs from its predecessor by one or two adjacent transpositions. The algorithm is practical and can be modified to efficiently count linear extensions, and to compute P(x < y), for all pairs x, y, in time O(n^2 + e(P)).
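As an illustration of what a linear extension is, the following Python sketch enumerates E(P) by plain backtracking. It is not the constant amortized time algorithm of the paper and makes no attempt to match its bounds; the function name `linear_extensions` and the encoding of the order as (a, b) pairs meaning a < b are assumptions made for this example.

```python
def linear_extensions(elements, relations):
    """Yield every linear extension of a poset as a tuple.

    `relations` is an iterable of (a, b) pairs meaning a < b.
    Plain backtracking, shown only to illustrate the definition.
    """
    elements = list(elements)
    preds = {x: set() for x in elements}
    for a, b in relations:
        preds[b].add(a)

    def extend(prefix, placed):
        if len(prefix) == len(elements):
            yield tuple(prefix)
            return
        for x in elements:
            # x may come next only if all of its predecessors are placed.
            if x not in placed and preds[x] <= placed:
                prefix.append(x)
                placed.add(x)
                yield from extend(prefix, placed)
                placed.remove(x)
                prefix.pop()

    yield from extend([], set())
```

For the poset on {a, b, c} with the single relation a < b, the sketch yields the three extensions (a, b, c), (a, c, b) and (c, a, b), so e(P) = 3.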
The Inverse Satisfiability Problem
SIAM Journal on Computing, 1998
Cited by 33 (6 self)
We study the complexity of telling whether a set of bit-vectors represents the set of all satisfying truth assignments of a Boolean expression of a certain type. We show that the problem is coNP-complete when the expression is required to be in conjunctive normal form with three literals per clause (3CNF). We also prove a dichotomy theorem analogous to the classical one by Schaefer, stating that, unless P=NP, the problem can be solved in polynomial time if and only if the clauses allowed are all Horn, or all anti-Horn, or all 2CNF, or all equivalent to equations modulo two.
New Algorithms for Enumerating All Maximal Cliques
2004
Cited by 33 (1 self)
In this paper, we consider the problems of generating all maximal (bipartite) cliques in a given (bipartite) graph G = (V, E) with n vertices and m edges. We propose two algorithms for enumerating all maximal cliques. One runs with O(M(n)) time delay and in O(n^2) space and the other runs with O(∆^4) time delay and in O(n + m) space, where ∆ denotes the maximum degree of G, M(n) denotes the time needed to multiply two n × n matrices, and the latter one requires O(nm) time as a preprocessing. For a given bipartite graph G, we propose three algorithms for enumerating all maximal bipartite cliques. The first algorithm runs with O(M(n)) time delay and in O(n^2) space, which immediately follows from the algorithm for the non-bipartite case. The second one runs with O(∆^3) time delay and in O(n + m) space, and the last one runs with O(∆^2) time delay and in O(n + m + N∆) space, where N denotes the number of all maximal bipartite cliques in G and both algorithms require O(nm) time as a preprocessing. Our algorithms improve upon all the existing algorithms, when G is either dense or sparse. Furthermore, computational experiments show that our algorithms for sparse graphs have significantly good performance for graphs which are generated randomly and appear in real-world problems.
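For readers unfamiliar with the enumeration problem itself, here is a short Python sketch of the classic Bron-Kerbosch recursion (without pivoting). It is a swapped-in textbook technique, not one of the algorithms proposed in this paper, and it does not provide the delay or space bounds stated above. The function name `maximal_cliques` and the adjacency-dictionary input are assumptions for illustration.

```python
def maximal_cliques(adj):
    """Yield every maximal clique of an undirected graph.

    `adj` maps each vertex to the set of its neighbours.
    Classic Bron-Kerbosch recursion without pivoting.
    """
    def expand(r, p, x):
        # r: current clique, p: candidates, x: already processed vertices.
        if not p and not x:
            yield frozenset(r)
            return
        for v in list(p):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            # Move v from the candidates to the processed set.
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())
```

On a triangle 1-2-3 with an extra edge 3-4, the sketch reports the two maximal cliques {1, 2, 3} and {3, 4}.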
Interconnection Semantics for Keyword Search in XML
CIKM, 2005
Cited by 27 (1 self)
 Add to MetaCart
A framework for describing semantic relationships among nodes in XML documents is presented. In contrast to earlier work, the XML documents may have ID references (i.e., they correspond to graphs and not just trees). A specific interconnection semantics in this framework can be defined explicitly or derived automatically. The main advantage of interconnection semantics is the ability to pose queries on XML data in the style of keyword search. Several methods for automatically deriving interconnection semantics are presented. The complexity of the evaluation and the satisfiability problems under the derived semantics is analyzed. For many important cases, the complexity is tractable and hence, the proposed interconnection semantics can be efficiently applied to real-world XML documents.
Frequent Subgraph Mining in Outerplanar Graphs
PROC. 12TH ACM SIGKDD INT. CONF. ON KNOWLEDGE DISCOVERY AND DATA MINING, 2006
Cited by 27 (5 self)
In recent years there has been an increased interest in frequent pattern discovery in large databases of graph structured objects. While the frequent connected subgraph mining problem for tree datasets can be solved in incremental polynomial time, it becomes intractable for arbitrary graph databases. Existing approaches have therefore resorted to various heuristic strategies and restrictions of the search space, but have not identified a practically relevant tractable graph class beyond trees. In this paper, we consider the class of outerplanar graphs, a strict generalization of trees, develop a frequent subgraph mining algorithm for outerplanar graphs, and show that it works in incremental polynomial time for the practically relevant subclass of well-behaved outerplanar graphs, i.e., which have only polynomially many simple cycles. We evaluate the algorithm empirically on chemo- and bioinformatics applications.
Consensus Algorithms for the Generation of All Maximal Bicliques
2002
Cited by 25 (4 self)
We describe a new algorithm for generating all maximal bicliques (i.e. complete bipartite, not necessarily induced subgraphs) of a graph. The algorithm is inspired by, and is quite similar to, the consensus method used in propositional logic. We show that some variants of the algorithm are totally polynomial, and even incrementally polynomial. The total complexity of the most efficient variant of the algorithms presented here is polynomial in the input size, and only linear in the output size. Computational experiments demonstrate its high efficiency on randomly generated graphs with up to 2,000 vertices and 20,000 edges.