## The Independent Choice Logic and Beyond

### BibTeX

@MISC{Poole_theindependent,
    author = {David Poole},
    title = {The Independent Choice Logic and Beyond},
    year = {}
}

### Abstract

The Independent Choice Logic (ICL) began in the early 1990s as a way to combine logic programming and probability into a coherent framework. The idea of the ICL is straightforward: there is a set of independent choices with a probability distribution over each choice, and a logic program that gives the consequences of the choices. The probabilities of the independent choices define a measure over possible worlds, and what is true in each possible world is given by the choices made in that world together with the logic program. The ICL is interesting because it is a simple, natural and expressive representation of rich probabilistic models. This paper gives an overview of the work done over the last decade and a half, and points towards the considerable work ahead, particularly in the areas of lifted inference and the problems of existence and identity.
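The possible-worlds semantics described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the atom names, the probabilities, the single rule, and the helper functions `consequences` and `probability` are all hypothetical, and the logic program is restricted to ground definite clauses without negation. Each possible world selects one atomic choice from every alternative, the world's measure is the product of the selected probabilities, and the probability of a query is the total measure of the worlds in which the program derives it.

```python
from itertools import product

# Hypothetical choice space: each alternative is a set of mutually exclusive
# atomic choices whose probabilities sum to 1. Names and numbers are
# illustrative assumptions, not taken from the paper.
alternatives = [
    {"knows_carry": 0.8, "not_knows_carry": 0.2},
    {"knows_addition": 0.9, "not_knows_addition": 0.1},
]

# Hypothetical ground definite-clause program: (head, [body atoms]).
rules = [
    ("correct_answer", ["knows_carry", "knows_addition"]),
]

def consequences(selected, rules):
    """Return the least fixed point of the rules given the selected choices."""
    true_atoms = set(selected)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in true_atoms and all(b in true_atoms for b in body):
                true_atoms.add(head)
                changed = True
    return true_atoms

def probability(query, alternatives, rules):
    """Sum the measure of every possible world in which the query is true."""
    total = 0.0
    # One possible world per way of picking one atomic choice per alternative.
    for selection in product(*(alt.items() for alt in alternatives)):
        weight = 1.0
        for _, p in selection:
            weight *= p  # choices are independent, so probabilities multiply
        atoms = [atom for atom, _ in selection]
        if query in consequences(atoms, rules):
            total += weight
    return total

p_correct = probability("correct_answer", alternatives, rules)
p_addition = probability("knows_addition", alternatives, rules)
```

Enumerating every world is exponential in the number of alternatives, so this only serves to make the semantics concrete on a toy example.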

### Citations

1659 | The Foundations of Statistics - Savage - 1954
Citation Context ...as needing (at least) the first-order predicate calculus. There are also good normative reasons for using Bayesian decision theory for decision making under uncertainty [Neumann and Morgenstern, 1953; Savage, 1972]. These arguments can be intuitively interpreted as seeing decision making as a form of gambling, and that probability and utility are the appropriate calculi for gambling. These arguments lead to th... |

1492 | The stable model semantics for logic programming - Gelfond, Lifschitz - 1988
Citation Context ... a virtue. These multiple models can correspond to multiple ways the world can be. Baral et al. [2004] have investigated having probability distributions over answer sets. The stable model semantics [Gelfond and Lifschitz, 1988] provides a semantics for logic programs where the clauses contain negations in the body (i.e., for “negation as failure”). The stable model semantics is particularly simple with acyclic logic progra... |

1117 | Causality: Models, Reasoning, and Inference - Pearl - 2000
Citation Context ...interpretation of logic programs gives another way to look at ICL. It turns out that any belief network can be represented as a deterministic system with (independent) probabilistic exogenous inputs [Pearl, 2000, p. 30]. One technique for making a probabilistic programming language is to use a standard programming language to define the deterministic system and to allow for random inputs. This is the basis f... |

1055 | Inductive logic programming - Muggleton - 1991
Citation Context ...a general-purpose solver and expect it to work (although [Chavira et al., 2006] comes close to this goal) as there is much more structure in the high-level representations. Stochastic logic programs [Muggleton, 1996] are quite different in their goal to the other frameworks presented here. Stochastic logic programs give probability distributions over proofs, rather than defining the probability that some proposi... |

960 | Negation as failure - Clark - 1978
Citation Context ...symbols, the grounding contains countably infinitely many clauses. Logic programs are important because they have: – a logical interpretation in terms of truth values of clauses (or their completion [Clark, 1978]). A logic program is a logical sentence from which one can ask for logical consequences. – a procedural semantics (or fixed-point semantics). A logic program is a non-deterministic pattern-matching ... |

854 | A tutorial on learning with Bayesian networks - Heckerman - 1995
Citation Context ...g decision trees and neural networks, as well as unsupervised learning. [Fig. 3. A belief network and plate representation from Example 7] – Learning the structure and probabilities of belief networks [Heckerman, 1995]. There has been much work on learning parameters for the related system called PRISM [Sato and Kameya, 2001]. There are a number of reasons that the ICL makes a good target language for learning: – ... |

627 | Inverse entailment and Progol - Muggleton - 1995
Citation Context ...s a good target language for learning: – Being based on logic programming, it can build on the successes of inductive logic programming [Muggleton and De Raedt, 1994; Quinlan and Cameron-Jones, 1995; Muggleton, 1995]. The fact that parts of ICL theories are logic programs should aid in this effort. – There is much local structure that naturally can be expressed in the ICL that can be exploited. One of the most s... |

565 | Markov logic networks - Richardson, Domingos - 2006
Citation Context ... the structure can be learned effectively. Note that the first two properties are not true of undirected models such as Markov networks (see e.g., Pearl [1988], pages 107–108). Markov Logic Networks [Richardson and Domingos, 2006] inherit all of the problems of undirected models. Example 1. Consider the problem of diagnosing errors that students make on simple multi-digit addition problems [Brown and Burton, 1978]: x2 x1 + y2... |

557 | Principles of artificial intelligence - Nilsson - 1982
Citation Context ...derable work ahead, particularly in the areas of lifted inference and the problems of existence and identity. 1 Introduction There are good normative arguments for using logic to represent knowledge [Nilsson, 1991; Poole et al., 1998]. These arguments are usually based on reasoning with symbols with an explicit denotation, allowing relations amongst individuals, and permitting quantification over individuals. ... |

510 | Learning probabilistic relational models - Getoor, Friedman, et al. - 2001
Citation Context ...ing [Friedman and Goldszmidt, 1996; Chickering et al., 1997]. These decision trees correspond to a particular form of ICL rules. – Unlike many representations such as Probabilistic Relational Models [Getoor et al., 2001], the ICL is not restricted to a fixed number of parameters to learn; it is possible to have representations where each individual has associated parameters. This should allow for richer representati... |

294 | Probabilistic Horn abduction and Bayesian networks - Poole - 1993 |

288 | Context-specific independence in Bayesian networks - Boutilier, Friedman, et al. - 1996
Citation Context ... don’t want a fully parametrized atomic choice, and you often want to say what happens when the atomic choice is false. The ICL representation lets us naturally specify context-specific independence [Boutilier et al., 1996; Poole, 1997a], where, for example, a may be independent of c when b is false but be dependent when b is true. Context-specific independence is often specified in terms of a tree for each variable; t... |

247 | Operations for learning with graphical models - Buntine - 1994
Citation Context ...mes (and change their skills through time). One way to represent this is to duplicate nodes for the different digits, problems, students and times. The resulting network can be depicted using plates [Buntine, 1994], as in Figure 2. The way to view this representation is that there are copies of the variables [Fig. 2 node and plate labels: Digit, Problem, x, y, Student, Time, knows carry, carry, knows addition] Fig. 2. A belief network with plates ad... |

234 | Learning Bayesian networks with local structure - Friedman, Goldszmidt - 1998
Citation Context ...expressed in the ICL that can be exploited. One of the most successful methods for learning Bayesian networks is to learn a decision tree for each variable given its predecessors in a total ordering [Friedman and Goldszmidt, 1996; Chickering et al., 1997]. These decision trees correspond to a particular form of ICL rules. – Unlike many representations such as Probabilistic Relational Models [Getoor et al., 2001], the ICL is n... |

172 | Acyclic programs - Apt, Bezem - 1991
Citation Context ...ogical statements, procedurally or as a database and query language. Logic programming research has gone in two general directions. In the first are those frameworks, such as acyclic logic programs [Apt and Bezem, 1991], that ensure there is a single model for any logic program. Acyclic logic programs assume that all recursions for variable-free queries eventually halt. In particular, a program is acyclic if there ... |

166 | A Bayesian approach to learning Bayesian networks with local structure - Chickering, Heckerman, et al. - 1997
Citation Context ...e exploited. One of the most successful methods for learning Bayesian networks is to learn a decision tree for each variable given its predecessors in a total ordering [Friedman and Goldszmidt, 1996; Chickering et al., 1997]. These decision trees correspond to a particular form of ICL rules. – Unlike many representations such as Probabilistic Relational Models [Getoor et al., 2001], the ICL is not restricted to a fixed ... |

153 | Identity uncertainty and citation matching - Pasula, Marthi, et al. - 2003 |

150 | The Independent Choice Logic for modelling multiple agents under uncertainty - Poole - 1997 |

144 | Diagnostic models for procedural bugs in basic mathematical skills - Brown, Burton - 1978
Citation Context ...orks [Richardson and Domingos, 2006] inherit all of the problems of undirected models. Example 1. Consider the problem of diagnosing errors that students make on simple multi-digit addition problems [Brown and Burton, 1978]: x2 x1 + y2 y1 = z3 z2 z1. The students are presented with the digits x1, x2, y1 and y2 and are expected to provide the digits z1, z2 and z3. From observing their behaviour, we want to infer whether th... |

141 | Recursive conditioning - Darwiche - 2001
Citation Context ...If we generated all of the explanations we could compute the probabilities exactly, but there are combinatorially many explanations. It should be possible to combine this with recursive conditioning [Darwiche, 2001] to get the best of both worlds. – Stochastic simulation; generating the needed atomic choices stochastically, and estimating the probabilities by counting the resulting proportions. One of the thing... |

140 | Prediction is deduction but explanation is abduction - Shanahan - 1989
Citation Context ...xplain all of the observations and see what these explanations also predict. This is similar to proposals in the non-monotonic reasoning community to mix abduction and default reasoning [Poole, 1989; Shanahan, 1989; Poole, 1990]. We can also bound the prior and posterior probabilities by generating only a few of the most plausible explanations (either top-down [Poole, 1993a] or bottom-up [Poole, 1996]). Thus we... |

136 | Answer set programming and plan generation - Lifschitz |

129 | BLOG: Probabilistic models with unknown objects - Milch, Marthi, et al. - 2005
Citation Context ...t depends on the previous time. By observing x, y and z for a student on various problems, we can query on the probability the student knows addition and knows how to carry. 2.6 Unknown Objects BLOG [Milch et al., 2005] claims to deal with unknown objects. In this section we will show how to write one of the BLOG examples in ICL. First note that BLOG has many built-in procedures, and ICL (as presented) has none. I w... |

128 | First-order probabilistic inference - Poole
Citation Context ...ication. There have been a number of attempts at doing this for various simple languages [Poole, 1997a, 2003; de Salvo Braz et al., 2005], but the final solution remains elusive. The general idea of [Poole, 2003] is that we can do lifted reasoning as in theorem proving or as in Prolog, using unification for matching, but instead of applying a substitution such as {X/c}, we need to split on X = c, giving the ... |

99 | Computational Intelligence: A Logical Approach - Poole, Mackworth, et al. - 1998 |

91 | Parameter learning of logic programs for symbolic-statistical modeling - Sato, Kameya - 2001 |

90 | Theory of Games and Economic Behavior - Neumann, Morgenstern - 1944
Citation Context ...uals. This is often translated as needing (at least) the first-order predicate calculus. There are also good normative reasons for using Bayesian decision theory for decision making under uncertainty [Neumann and Morgenstern, 1953; Savage, 1972]. These arguments can be intuitively interpreted as seeing decision making as a form of gambling, and that probability and utility are the appropriate calculi for gambling. These argume... |

83 | State Abstraction for Programmable Reinforcement Learning Agents - Andre, Russell - 2002
Citation Context ...ic logic programs (they can even have negation as failure) to specify the deterministic system – IBAL [Pfeffer, 2001] uses an ML-like functional language to specify the deterministic system – A-Lisp [Andre and Russell, 2002] uses Lisp to specify the deterministic system – CES [Thrun, 2000] uses C++ to specify the deterministic system. While each of these has its advantages, the main advantage of ICL is the declarativ... |

63 | Probabilistic reasoning with answer sets - Baral, Gelfond, et al. - 2004 |

61 | Induction of logic programs: Foil and related systems - Quinlan, Cameron-Jones - 1995 |

58 | IBAL: A probabilistic rational programming language - Pfeffer - 2001 |

58 | A methodology for using a default and abductive reasoning system - Poole - 1993
Citation Context ...us time. The plate notation is very convenient and natural for many problems and leads to what could be called parametrized belief networks that are networks that are built from templates [Horsch and Poole, 1990]. Note that it is difficult to use plates when one variable depends on different instances of the same relation. For example, if whether two authors collaborate depends on whether they have coauthore... |

54 | Compiling relational Bayesian networks for exact inference - Chavira, Darwiche, et al. - 2006
Citation Context ...ng and Poole, 1996; Poole, 1996; Poole and Zhang, 2003]. We should not assume that we can just pass off the probabilistic inference problem to a general-purpose solver and expect it to work (although [Chavira et al., 2006] comes close to this goal) as there is much more structure in the high-level representations. Stochastic logic programs [Muggleton, 1996] are quite different in their goal to the other frameworks pre... |

46 | Probabilistic models for relational data - Heckerman, Meek, et al. - 2004 |

39 | Logic programming, abduction and probability - a top-down anytime algorithm for estimating prior and posterior probabilities - Poole - 1993
Citation Context ...th data). The complete axiomatization is available at the CILog2 web site (Footnote 2). Note that the CILog implementation can solve this as it generates only some of the proofs and bounds the error [Poole, 1993a] (although not as efficiently as possible; see Section 3). BLOG lets you build libraries of distributions in Java. ICL lets you build them in (pure) Prolog. The main difference is that a pure Prolo... |

39 | Abducing through negation as failure: stable models within the independent choice logic - Poole - 2000 |

37 | Probabilistic partial evaluation: Exploiting rule structure in probabilistic inference - Poole - 1997
Citation Context ...t choices and a logic program to give the consequences of the choices. The independent choice logic extends probabilistic Horn abduction in allowing for multiple agents each making their own choices [Poole, 1997b] (where nature is a special agent who makes choices probabilistically) and in allowing negation as failure in the logic [Poole, 2000b]. The ICL is still one of the simplest and most powerful represe... |

23 | Probabilistic conflicts in a search algorithm for estimating posterior probabilities in Bayesian networks - Poole - 1996
Citation Context .... It is straightforward to translate these into the ICL. The logic program also naturally represents the “noisy or”, when the bodies are not disjoint, which is a form of causal independence [Zhang and Poole, 1996]. Standard algorithms such as clique-tree propagation are not good at reasoning with these representations, but there are ways to exploit noisy-or and context specific independence using modification... |

22 | Towards programming tools for robots that integrate probabilistic computation and learning - Thrun - 2000
Citation Context ...rministic system – IBAL [Pfeffer, 2001] uses an ML-like functional language to specify the deterministic system – A-Lisp [Andre and Russell, 2002] uses Lisp to specify the deterministic system – CES [Thrun, 2000] uses C++ to specify the deterministic system. While each of these has its advantages, the main advantage of ICL is the declarative semantics and the relational view (it is also an extension of Da... |

21 | Efficient computation for the noisy MAX - Díez, Galán - 2003
Citation Context ...tion are not good at reasoning with these representations, but there are ways to exploit noisy-or and context specific independence using modifications of variable elimination [Zhang and Poole, 1996; Díez and Galán, 2002; Poole and Zhang, 2003] or recursive conditioning [Allen and Darwiche, 2003]. Example 5. Continuing our addition example, it is difficult to specify a Bayesian network even for the simple case of add... |

20 | Representing diagnostic knowledge for probabilistic Horn abduction - Poole - 1991
Citation Context ...n be done in a simple, straightforward manner is the motivation behind a large body of research over the last 20 years. The independent choice logic (ICL) started off as Probabilistic Horn Abduction [Poole, 1991a,b, 1993a,b] (the first three of these papers had a slightly different language), which allowed for probabilistically independent choices and a logic program to give the consequences of the choices. ... |

19 | A dynamic approach to probabilistic inference using Bayesian networks - Horsch, Poole - 1990
Citation Context ... the previous time. The plate notation is very convenient and natural for many problems and leads to what could be called parametrized belief networks that are networks that are built from templates [Horsch and Poole, 1990]. Note that it is difficult to use plates when one variable depends on different instances of the same relation. For example, if whether two authors collaborate depends on whether they have coauthore... |

16 | Inference in hybrid Bayesian networks with mixtures of truncated exponentials - Cobb, Shenoy - 2006
Citation Context ...fine continuous variables in terms of a mixture of kernel functions, such as mixtures of Gaussian distributions, truncated Gaussians [Cozman and Krotkov, 1994] or truncated exponential distributions [Cobb and Shenoy, 2006]. This can be done by having Gaussian alternatives. Allowing Gaussian alternatives and conditions in the logic programs, means that the program has to deal with truncated Gaussians; but it also means... |

14 | Representing Bayesian networks within probabilistic Horn abduction - Poole - 1991
Citation Context ...n be done in a simple, straightforward manner is the motivation behind a large body of research over the last 20 years. The independent choice logic (ICL) started off as Probabilistic Horn Abduction [Poole, 1991a,b, 1993a,b] (the first three of these papers had a slightly different language), which allowed for probabilistically independent choices and a logic program to give the consequences of the choices. ... |

13 | Logical generative models for probabilistic reasoning about existence, roles and identity - Poole
Citation Context ...room, but you haven’t observed its size, and a large bedroom, but you haven’t observed its colour. It isn’t well defined (or obvious) how to condition on the observation of c. A solution proposed in [Poole, 2007] is to only have probabilities over well-defined propositions, and for the theory to only refer to closed formulae; this avoids the need to do correspondence between objects in the model and individu... |

12 | Of Klingons and Starships: Bayesian Logic for the 23rd Century - Laskey, Costa - 2005 |

9 | Truncated Gaussians as Tolerance Sets - Cozman, Krotkov - 1997
Citation Context ...partitions is not computationally satisfactory. It is better to define continuous variables in terms of a mixture of kernel functions, such as mixtures of Gaussian distributions, truncated Gaussians [Cozman and Krotkov, 1994] or truncated exponential distributions [Cobb and Shenoy, 2006]. This can be done by having Gaussian alternatives. Allowing Gaussian alternatives and conditions in the logic programs, means that the... |

4 | Bayesian probability, graphical models, and abduction - Learning - 1998 |