Efficiently mining long patterns from databases
, 1998
Cited by 325 (3 self)
We present a pattern-mining algorithm that scales roughly linearly in the number of maximal patterns embedded in a database irrespective of the length of the longest pattern. In comparison, previous algorithms based on Apriori scale exponentially with longest pattern length. Experiments on real data show that when the patterns are long, our algorithm is more efficient by an order of magnitude or more. ... maximal frequent itemset, Max-Miner’s output implicitly and concisely represents all frequent itemsets. Max-Miner is shown to result in two or more orders of magnitude in performance improvements over Apriori on some data-sets. On other data-sets where the patterns are not so long, the gains are more modest. In practice, Max-Miner is demonstrated to run in time that is roughly linear in the number of maximal frequent itemsets and the size of the database, irrespective of the size of the longest frequent itemset.
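The statement that the maximal frequent itemsets implicitly represent all frequent itemsets can be checked on a toy example. The sketch below is not Max-Miner: it is a brute-force enumeration over a made-up transaction list, and the names `frequent_itemsets`, `maximal_only`, `expand` and the minimum support of 2 are assumptions chosen for illustration.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Brute force: every non-empty itemset contained in at least min_support transactions."""
    items = sorted(set().union(*transactions))
    result = set()
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                result.add(cand)
    return result

def maximal_only(itemsets):
    """Keep only the itemsets that have no proper superset in the collection."""
    return {s for s in itemsets if not any(s < other for other in itemsets)}

def expand(maximal):
    """Recover all frequent itemsets as the non-empty subsets of the maximal ones."""
    out = set()
    for m in maximal:
        for size in range(1, len(m) + 1):
            out.update(frozenset(c) for c in combinations(sorted(m), size))
    return out

# Hypothetical toy database; each transaction is a set of single-letter items.
transactions = [frozenset("abcd"), frozenset("abc"), frozenset("abd"), frozenset("ce")]
frequent = frequent_itemsets(transactions, min_support=2)
maximal = maximal_only(frequent)
```

Here the two maximal itemsets are enough to regenerate all eleven frequent itemsets, which is why a miner that outputs only maximal sets loses no information about which itemsets are frequent (it does not, by itself, retain their individual support counts).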
A Continuous Approach to Inductive Inference
- Mathematical Programming
, 1992
Cited by 38 (2 self)
In this paper we describe an interior point mathematical programming approach to inductive inference. We list several versions of this problem and study in detail the formulation based on hidden Boolean logic. We consider the problem of identifying a hidden Boolean function $F : \{0,1\}^n \to \{0,1\}$ using outputs obtained by applying a limited number of random inputs to the hidden function. Given this input-output sample, we give a method to synthesize a Boolean function that describes the sample. We pose the Boolean Function Synthesis Problem as a particular type of Satisfiability Problem. The Satisfiability Problem is translated into an integer programming feasibility problem, which is solved with an interior point algorithm for integer programming. A similar integer programming implementation has been used in a previous study to solve randomly generated instances of the Satisfiability Problem. In this paper we introduce a new variant of this algorithm, where the Riemannian metric used...
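The translation from satisfiability to integer programming feasibility can be illustrated with the textbook 0-1 encoding, in which each clause becomes one linear inequality. This is a minimal sketch under assumptions: the abstract does not say which encoding the paper uses, the DIMACS-style signed-integer literals and the function names are hypothetical, and the exhaustive check stands in for the interior point solver only to confirm that the two formulations agree.

```python
from itertools import product

def clause_to_inequality(clause, n_vars):
    """Encode a clause as (coeffs, rhs), meaning sum(coeffs[i] * x[i]) >= rhs over x in {0,1}^n.

    A positive literal x_i contributes x_i; a negative literal contributes (1 - x_i),
    whose constant 1 is moved to the right-hand side.
    """
    coeffs = [0] * n_vars
    rhs = 1
    for lit in clause:
        var = abs(lit) - 1
        if lit > 0:
            coeffs[var] += 1
        else:
            coeffs[var] -= 1
            rhs -= 1
    return coeffs, rhs

def satisfies_clauses(x, clauses):
    """Direct Boolean evaluation of a CNF on a 0-1 assignment."""
    return all(any((x[abs(l) - 1] == 1) == (l > 0) for l in clause) for clause in clauses)

def satisfies_inequalities(x, system):
    """Evaluate the linear system produced by clause_to_inequality."""
    return all(sum(c * v for c, v in zip(coeffs, x)) >= rhs for coeffs, rhs in system)

# (x1 or not x2 or x3) and (not x1 or x2) and (not x3)
clauses = [[1, -2, 3], [-1, 2], [-3]]
system = [clause_to_inequality(c, 3) for c in clauses]
agree = all(
    satisfies_clauses(x, clauses) == satisfies_inequalities(x, system)
    for x in product((0, 1), repeat=3)
)
```

The first clause becomes x1 - x2 + x3 >= 0, and every 0-1 point satisfies the inequalities exactly when it satisfies the clauses.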
Compilation for Critically Constrained Knowledge Bases
- In Proc. of the 13th National Conference on Artificial Intelligence (AAAI’96)
, 1996
Cited by 15 (0 self)
We show that many "critically constrained" Random 3SAT knowledge bases (KBs) can be compiled into disjunctive normal form easily by using a variant of the "Davis-Putnam" proof procedure. From these compiled KBs we can answer all queries about entailment of conjunctive normal formulas, also easily -- compared to a "brute-force" approach to approximate knowledge compilation into unit clauses for the same KBs. We exploit this fact to develop an aggressive hybrid approach which attempts to compile a KB exactly until a given resource limit is reached, then falls back to approximate compilation into unit clauses. The resulting approach handles all of the critically constrained Random 3SAT KBs with average savings of an order of magnitude over the brute-force approach.
Introduction
Consider the task of reasoning from a propositional knowledge base (KB) F which is expressed as a conjunctive normal formula (CNF). We are given other, query CNFs $Q_1, Q_2, \ldots, Q_N$ and asked, for each $Q_i$,...
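One way to see why a KB compiled into disjunctive normal form makes CNF entailment queries easy is the following check: the DNF entails the query exactly when every term entails every clause, and a consistent term entails a clause exactly when they share a literal (or the clause is a tautology). The sketch below illustrates that general fact only; it is not the paper's compilation procedure, and the signed-integer literal encoding and function names are assumptions.

```python
def term_entails_clause(term, clause):
    """A consistent term (set of literals) entails a clause iff they share a literal
    or the clause contains some literal together with its negation."""
    clause = set(clause)
    if any(-lit in clause for lit in clause):
        return True  # tautological clause is entailed by anything
    return not clause.isdisjoint(term)

def dnf_entails_cnf(dnf_terms, cnf_query):
    """Entailment test that is linear in (number of terms) x (number of clauses)."""
    return all(
        term_entails_clause(set(term), clause)
        for term in dnf_terms
        for clause in cnf_query
    )

# Hypothetical compiled KB: (x1 and x2) or (x1 and not x3)
kb = [[1, 2], [1, -3]]
```

For this KB, the query `[[1]]` is entailed, as is `[[2, -3], [1]]`, while `[[2]]` is not, because the second term leaves x2 unconstrained.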
An SE-tree-based Prime Implicant Generation Algorithm
- IEEE Trans
, 1994
Cited by 5 (1 self)
Prime implicants/implicates (PIs) have been shown to be a useful tool in several problem domains. In Model-Based Diagnosis (MBD), [de Kleer et al. 90] have used PIs to characterize diagnoses. We present a PI generation algorithm which, although based on the general SE-tree-based search framework, is effectively an improvement of a particular PI generation algorithm proposed by [Slagle et al. 70]. The improvement is achieved via a decomposition tactic which is boosted by the SE-tree-based framework. The new algorithm is also more flexible in a number of ways. We present empirical results comparing the new algorithm to the old one, as well as to current PI generation algorithms.
1 Introduction
Prime implicates/implicants (PIs) were a topic of great interest to researchers in the early days of computer science, in part because of their use in procedures for boolean function minimization [Quine 52]. A number of algorithms were developed, including [Quine 52], [Karnaugh 53], [McCluskey 56]...
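For readers unfamiliar with the object being generated, the following sketch computes the prime implicants of a small Boolean function by exhaustive enumeration. It is deliberately naive and is neither the SE-tree-based algorithm nor the [Slagle et al. 70] algorithm; the term representation (a tuple over 0, 1 and `None` for an absent variable) and all names are assumptions made for illustration.

```python
from itertools import product

def covers(term, point):
    """A term covers a point if every literal present in the term matches it."""
    return all(t is None or t == p for t, p in zip(term, point))

def is_implicant(term, on_set, n):
    """A term is an implicant if every point it covers is in the on-set."""
    return all(p in on_set for p in product((0, 1), repeat=n) if covers(term, p))

def prime_implicants(on_set, n):
    """Enumerate all 3^n terms and keep implicants from which no literal can be dropped."""
    on_set = set(on_set)
    primes = []
    for term in product((0, 1, None), repeat=n):
        if not is_implicant(term, on_set, n):
            continue
        droppable = any(
            term[i] is not None
            and is_implicant(term[:i] + (None,) + term[i + 1:], on_set, n)
            for i in range(n)
        )
        if not droppable:
            primes.append(term)
    return primes

# Three-input majority function: true when at least two inputs are 1.
majority_on_set = {(0, 1, 1), (1, 0, 1), (1, 1, 0), (1, 1, 1)}
primes = prime_implicants(majority_on_set, 3)
```

Checking single-literal removal is enough for primality, because any term lying between an implicant and one of its literal-subsets that is also an implicant is itself an implicant. The enumeration visits all 3^n terms, which is exactly the cost that dedicated PI generation algorithms aim to avoid.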
Composite Distributive Lattices as Annotation Domains for Mediators
- Proc. of AISC'2000
, 2000
Cited by 2 (1 self)
In a mediator system based on annotated logics, it is desirable to allow annotations from different lattices in one program on a per-predicate basis. These lattices, however, may be related through common sublattices, which calls for predicates that can carry combinations of annotations or access components of annotations.
BOOM - a Boolean Minimizer
, 2001
Cited by 1 (1 self)
This report presents an algorithm for two-level Boolean minimization (BOOM) based on a new implicant generation paradigm. In contrast to all previous minimization methods, where the implicants are generated bottom-up, the proposed method works top-down. Thus instead of increasing the dimensionality of implicants by omitting literals from their terms, the dimension of a term is gradually decreased by adding new literals. One of the drawbacks of the classical approach to prime implicant generation, dating back to the original Quine-McCluskey method, is the use of terms (be they minterms or terms of higher dimension) found in the definition of the function to be minimized as a basis for the solution. Thus the choice of terms used originally for covering the function may influence the final solution. In the proposed method, the original coverage influences the final solution only indirectly, through the number of literals used. Starting from an n-dimensional hypercube (where n is the number of input variables), new terms are generated, and only the on-set and off-set are consulted. Thus the original choice of the implicant terms is of little importance. Most minimization methods use two basic phases introduced by Quine-McCluskey, known as prime implicant generation and the covering problem solution. Some more modern methods, including the well-known ESPRESSO, combine these two phases, reducing the number of implicants to be processed. A combination of prime implicant generation with the solution of the covering problem is also used in the BOOM approach proposed here, because the search for new literals to be included in a term aims at maximum coverage of the output function (coverage-directed search, or CD-search). The implicants generated during the CD-search are then expanded to become primes. Different heuristics are used during the CD-search and when solving the covering problem.
The function to be minimized is defined by its on-set and off-set, listed in a truth table. Thus the don't care set, which normally represents the dominant part of the truth table, need not be specified explicitly. The proposed minimization method is efficient above all for functions with several hundred input variables and with a large portion of don't care states. The minimization method has been tested on several different kinds of problems. The MCNC standard benchmarks were solved several times in order to evaluate the minimality of the solution and the runtime. Both "easy" and "hard" MCNC benchmarks were solved and compared with the solutions obtained by ESPRESSO. In many cases the time needed to find the minimum solution on an ordinary PC was too short to measure. The procedure is so fast that even for large problems with hundreds of input variables it often finds a solution in a fraction of a second. Hence if the first solution does not meet the requirements, it can be improved iteratively. Larger problems (with more than 100 input variables and more than 100 terms with defined output values) were generated randomly and solved by BOOM and by ESPRESSO. BOOM was in this case up to 166 times faster. For problems with more than 300 input variables no comparison with any other minimization tool was possible, because no other system, including ESPRESSO, can solve such problems. The dimension of the problems solved by BOOM can easily be increased beyond 1000 input variables, because the runtime grows linearly with the number of inputs. On the other hand, as the runtime grows roughly with the square of the size of the care set, for problems of very high dimension the success largely depends on the number of care terms. The quality of the proposed method was also tested on other problems such as graph coloring and symmetric function minimization.
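The top-down idea described above (start from the whole hypercube, add literals until the term no longer touches the off-set, then expand the result to a prime) can be sketched as follows. This is an illustrative approximation, not the BOOM implementation: the greedy rule of picking the literal that occurs most often among the still-covered on-set rows is an assumed stand-in for the report's heuristics, the function names are hypothetical, and the on-set and off-set are assumed to be disjoint and the on-set non-empty.

```python
from collections import Counter

def covers(term, row):
    """A term is a dict {variable index: required value}; the empty dict is the full hypercube."""
    return all(row[var] == val for var, val in term.items())

def cd_search_term(on_set, off_set):
    """Add literals one at a time until the term covers no off-set row."""
    term = {}
    candidates = list(on_set)
    while any(covers(term, row) for row in off_set):
        # Count each free literal among on-set rows still covered by the term.
        counts = Counter(
            (var, row[var])
            for row in candidates
            for var in range(len(row))
            if var not in term
        )
        (var, val), _ = counts.most_common(1)[0]
        term[var] = val
        candidates = [row for row in candidates if row[var] == val]
    return term

def expand_to_prime(term, off_set):
    """Drop every literal whose removal still keeps the term clear of the off-set."""
    term = dict(term)
    for var in list(term):
        trial = {v: x for v, x in term.items() if v != var}
        if not any(covers(trial, row) for row in off_set):
            term = trial
    return term

# Hypothetical 4-input function given only by its care rows; all other rows are don't cares.
on_set = [(1, 1, 0, 1), (1, 1, 1, 0), (1, 0, 1, 1)]
off_set = [(0, 0, 0, 0), (0, 1, 1, 1), (1, 0, 0, 0)]
term = cd_search_term(on_set, off_set)
prime = expand_to_prime(term, off_set)
```

Only the listed on-set and off-set rows are ever consulted, so the don't care set never has to be enumerated, which matches the motivation given in the abstract for functions with many inputs and few care terms.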
METAPRIME, an Interactive Fault Tree Analyser
, 1994
Cited by 1 (0 self)
This paper introduces an analysis method of coherent as well as noncoherent fault trees that overcomes this limitation because its computational cost is related to neither the number of basic events, nor the number of gates, nor the number of prime implicants of these trees. We present the concepts underlying the prototype tool METAPRIME, and the experimental results obtained with this tool on real life fault trees. These results show that these concepts allow us to completely analyse in seconds fault trees that no previously available technique could ever partially analyse, for instance noncoherent fault trees with more than 10
An Algorithm for Induction of Possibilistic
We present a new algorithm, called Optimist, which generates possibilistic set-valued rules from tables containing categorical attributes taking a finite number of values. An example of such a rule might be "IF HOUSEHOLDSIZE={Two OR Three} AND OCCUPATION={Professional OR Clerical} THEN PAYMENT_METHOD={CashCheck (Max=249) OR DebitCard (Max=175)}". The algorithm is based on an original formal framework generalising the conventional boolean approach in two directions: (i) finite-valued variables and (ii) continuous-valued semantics. Using this formalism we approximate the multidimensional distribution induced from data by a number of possibilistic prime disjunctions (patterns) representing the widest intervals of impossible combinations of values. The Optimist algorithm described in the paper generates the most interesting prime disjunctions in one pass through the data set by transforming the DNF representing data into the possibilistic CNF representing knowledge. It consists of generation, absorption and filtration parts. The set-valued rules built from the possibilistic patterns are optimal in the sense that they have the most general condition and the most specific conclusion. For the case of finite-valued attributes and two-valued semantics the algorithm is implemented in the Chelovek rule induction system for Windows 95.

