## Loglinear Models for First-Order Probabilistic Reasoning (1999)

Venue: Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence

Citations: 35 (5 self)

### BibTeX

```bibtex
@INPROCEEDINGS{Cussens99loglinearmodels,
  author    = {James Cussens},
  title     = {Loglinear Models for First-Order Probabilistic Reasoning},
  booktitle = {Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence},
  year      = {1999},
  pages     = {126--133},
  publisher = {Morgan Kaufmann}
}
```

### Abstract

Recent work on loglinear models in probabilistic constraint logic programming is applied to first-order probabilistic reasoning. Probabilities are defined directly on the proofs of atomic formulae, and by marginalisation on the atomic formulae themselves. We use Stochastic Logic Programs (SLPs) composed of labelled and unlabelled definite clauses to define the proof probabilities. We have a conservative extension of first-order reasoning, so that, for example, there is a one-one mapping between logical and random variables. We show how, in this framework, Inductive Logic Programming (ILP) can be used to induce the features of a loglinear model from data. We also compare the presented framework with other approaches to first-order probabilistic reasoning.

Keywords: loglinear models, constraint logic programming, inductive logic programming
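
The abstract's central construction (a distribution defined on proofs and marginalised onto the atoms they prove) can be sketched in a few lines of Python. This is a toy illustration only: the program, clause labels, and the simple global normalisation are invented here and are not the paper's actual SLP semantics.

```python
# Toy stochastic logic program (SLP) over a unary predicate s/1.
# Each clause carries a label; a proof's unnormalised weight is the
# product of the labels of the clauses it uses, and the probability
# of an atom is obtained by summing (marginalising) over its proofs.
# Program and labels are invented for illustration:
#
#   0.4 : s(X) :- q(X), r(X).
#   0.6 : s(X) :- q(X).
#   q(a). q(b). r(a).        (unlabelled facts count as weight 1)

Q = {"a", "b"}   # ground atoms of q/1
R = {"a"}        # ground atoms of r/1

def proof_weights(x):
    """Unnormalised weight of each proof of s(x)."""
    weights = []
    if x in Q and x in R:
        weights.append(0.4)  # proof via the first clause
    if x in Q:
        weights.append(0.6)  # proof via the second clause
    return weights

atom_weight = {x: sum(proof_weights(x)) for x in sorted(Q)}
Z = sum(atom_weight.values())              # normalising constant
atom_prob = {x: w / Z for x, w in atom_weight.items()}

print(atom_prob)  # s(a) has two proofs, s(b) only one
```

Note that `s(a)` accumulates weight from both of its proofs before normalisation, which is exactly the marginalisation step the abstract describes.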

### Citations

1913 | Foundations of Logic Programming
- Lloyd
- 1987
Citation Context: ... future work in Section 8. 2 Logic programming essentials We give a very brief overview of logic programming. For more details, the reader can consult any standard textbook on logic programming, e.g. [13]. In this paper we will consider only definite logic programs. Definite (logic) programs consist of a set of definite clauses, where each definite clause is a disjunctive first-order formula such as p(...

1343 | Local Computations with Probabilities on Graphical Structures and their Application to Expert Systems
- Lauritzen, Spiegelhalter
- 1988
Citation Context: ... SLPs and Markov nets Markov nets (or undirected Bayes nets) are representations of graphical models, a special case of loglinear models. Fig 3 shows the "Asia" Markov net used as a running example in [12], and which is also implemented as an example in the BUGS system [21]. [Figure residue: node labels of the "Asia" net — visit to Asia? (A), tuberculosis? (T), smoking? (S), either tub. or cancer? (E), positive X-ray? (X), dyspnoea? (D), bronchitis? (B), lu...]

1090 | Inductive logic programming
- Muggleton
- 1991
Citation Context: ...e the clauses used in a proof are the features defining the probability of that proof, with clause labels denoting the parameters. Eisele [6] examined this approach from an NLP perspective. Muggleton [14] introduced Stochastic Logic Programs, approaching the issue from a general logic programming perspective, with a view to applications in Inductive Logic Programming. In these cases, Stochastic Contex...

572 | Inducing features of random fields
- Della Pietra, Della Pietra, Lafferty
- 1997
Citation Context: ...ence, and the loglinear model can be used to find the most likely parse of any particular sentence. Riezler extends the improved iterative scaling algorithm of Della Pietra, Della Pietra and Lafferty [18] to induce features and parameters for a loglinear model from incomplete data. Incomplete data here consists of just atoms, rather than the proofs of those atoms. In an NLP context this means having a...

381 | The estimation of stochastic context-free grammars using the Inside-Outside algorithm, Computer Speech and Language
- Lari, Young
- 1990
Citation Context: ...hen generating a sentence from a SCFG---a production rule can always be applied to a nonterminal. Because of this a number of techniques (such as the inside-outside algorithm for parameter estimation [11]) can be applied to SCFGs, but cannot be lifted to FGs or LPs. (See Abney [1] for a demonstration of this.) We define a stochastic logic program (SLP) as follows. An SLP is a logic program where some ...

302 | Context-specific independence in Bayesian networks
- Boutilier, Friedman, et al.
- 1996
Citation Context: ...C given particular values of B. This conditional conditional independence, or context-specific independence, between A and C crops up often in applications and has been investigated by Boutilier et al. [4]. To represent context-sensitive independence, we need to be able to differentiate between these two sorts of values of B. Let us assume we have two predicates, strong/1 and weak/1, defined to be mutua...

145 | Stochastic attribute-value grammars
- Abney
- 1997
Citation Context: ...ere the f_i are the features of the distribution, the λ_i are the model parameters and Z is a normalising constant. 3.1 Probabilistic Constraint Logic Programming In [19], Riezler develops Abney's work [1] defining a loglinear model on the proofs of formulae with some constraint logic program. This requires defining features on these proofs (the f_i) and defining the model parameters (the λ_i). The ess...

136 | Hybrid Probabilistic Logic Programs
- Dekhtyar, Subrahmanian
- 1997
Citation Context: ... do not use labelled rules p(X,Y) ← q(X,Y), r(Y) to define the probability that some ground atom p(a,b) is true as in KBMC, or to provide bounds on the probability that p(a,b) is true as in [20, 16]. Instead, we have a binary distribution associated with p(X,Y) which defines the probability of instantiations such as {X=a, Y=b}. In order to reason about the probability of the truth of atoms, we...
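
The contrast drawn in this excerpt, between labelled rules giving the probability that an atom is *true* and a distribution over *instantiations* of an atom, can be sketched as follows. The program, labels, and helper names are illustrative assumptions, not taken from the paper.

```python
import random

# Sketch: the labels on the clauses defining p(X,Y) determine the
# probability of instantiations {X=x, Y=y}, not the probability that
# p(x,y) is true. Illustrative toy program:
#
#   p(X,Y) :- q(X), r(Y).
#   0.7 : q(a).   0.3 : q(b).     (labelled facts for q/1)
#   0.5 : r(c).   0.5 : r(d).     (labelled facts for r/1)

q = {"a": 0.7, "b": 0.3}
r = {"c": 0.5, "d": 0.5}

def p_instantiation(x, y):
    """Probability of the instantiation {X=x, Y=y} of p(X,Y)."""
    return q.get(x, 0.0) * r.get(y, 0.0)

def sample():
    """Stochastic SLD-style derivation: choose each clause by its label."""
    x = random.choices(list(q), weights=list(q.values()))[0]
    y = random.choices(list(r), weights=list(r.values()))[0]
    return x, y

print(p_instantiation("a", "c"))  # 0.7 * 0.5
```

Every atom `p(x,y)` reachable by some derivation is true with certainty; the labels only distribute probability mass across which instantiation a stochastic derivation produces.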

108 | Relational Bayesian networks
- Jaeger
- 1997
Citation Context: ... and the query-specific exploration of an SLD-tree by Prolog which deserve further investigation. Another approach to relational probabilistic reasoning is the relational Bayesian networks of Jaeger [8]. Here whole interpretations are the nodes of a Bayesian net. It is conceivable that such networks could be implemented as an SLP, using some suitable object-level representation of an interpretation,...

98 | Learning from positive data
- Muggleton
- 1996
Citation Context: ...to define the structural features of our distribution, it is natural to look to ILP for techniques which induce such structural features from data. Work in ILP on learning from positive examples only [15, 3] is of relevance here, but the most thorough incorporation of probabilistic approaches into ILP is by Dehaspe in [5]. Dehaspe presents the MACCENT algorithm which constructs a log-linear model using b...

95 | Answering queries from context-sensitive probabilistic knowledge bases
- Ngo, Haddawy
- 1996
Citation Context: ...or manipulating tables. The current work seeks to extend that of Wong et al. by moving from a relational database setting to the logic programming setting. In Knowledge-based model construction (KBMC) [17, 10, 7] first-order rules with associated probabilities are used to generate Bayesian networks for particular queries. As in SLD-resolution, queries are matched to the heads of rules, but in KBMC this results...

51 | Logic programs with uncertainties: a tool for implementing rule-based systems
- Shapiro
- 1983
Citation Context: ... do not use labelled rules p(X,Y) ← q(X,Y), r(Y) to define the probability that some ground atom p(a,b) is true as in KBMC, or to provide bounds on the probability that p(a,b) is true as in [20, 16]. Instead, we have a binary distribution associated with p(X,Y) which defines the probability of instantiations such as {X=a, Y=b}. In order to reason about the probability of the truth of atoms, we...

49 | Effective Bayesian inference for stochastic programs
- Koller, McAllester, et al.
- 1997
Citation Context: ...rectly, perhaps resorting to quite complex SLPs to model complex interactions between degrees of belief. Finally, SLPs are very closely related to the stochastic (functional) programs of Koller et al. [9]. Stochastic execution of the functional program defines a distribution over outputs of the program. As we have done here, Koller et al. show how Bayesian nets and SCFGs can be represented in their ric...

45 | Learning probabilities for noisy first-order rules
- Koller, Pfeffer
- 1997
Citation Context: ...or manipulating tables. The current work seeks to extend that of Wong et al. by moving from a relational database setting to the logic programming setting. In Knowledge-based model construction (KBMC) [17, 10, 7] first-order rules with associated probabilities are used to generate Bayesian networks for particular queries. As in SLD-resolution, queries are matched to the heads of rules, but in KBMC this results...

37 | Maximum entropy modeling with clausal constraints
- Dehaspe
- 1997
Citation Context: ...uctural features from data. Work in ILP on learning from positive examples only [15, 3] is of relevance here, but the most thorough incorporation of probabilistic approaches into ILP is by Dehaspe in [5]. Dehaspe presents the MACCENT algorithm which constructs a log-linear model using boolean clausal constraints as features. Dehaspe uses the "learning from interpretations" ILP setting where each exam...

35 | Query evaluation in probabilistic relational databases
- Zimányi
- 1997
Citation Context: ...g Markov nets. Part of a Prolog implementation of the SLP which represents the "Asia" Markov net can be found in Fig 6. The c/3 predicate looks like a probabilistic database such as those analysed in [23], but, for example, c1(t,t,0.0005) does not signify that the tuple (t,t) belongs to the relation c1/2 with probability 0.0005. This tuple belongs to the relation c1/2 with certainty; we just have a d...

31 | Probabilistic constraint logic programming
- Riezler
- 1999
Citation Context: ... p(ω) = Z⁻¹ exp(Σ_i λ_i f_i(ω))   (1) where the f_i are the features of the distribution, the λ_i are the model parameters and Z is a normalising constant. 3.1 Probabilistic Constraint Logic Programming In [19], Riezler develops Abney's work [1] defining a loglinear model on the proofs of formulae with some constraint logic program. This requires defining features on these proofs (the f_i) and defining the...
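
The loglinear form quoted in this excerpt, p(ω) = Z⁻¹ exp(Σ_i λ_i f_i(ω)), is straightforward to evaluate directly when the set of proofs ω is finite. The feature counts and parameter values below are invented purely for illustration.

```python
import math

# Direct evaluation of a loglinear model over a finite set of proofs.
# f_i(ω) is how often feature i (e.g. a clause) occurs in proof ω;
# the counts and parameters λ_i here are illustrative assumptions.

proof_features = {          # ω -> (f_1(ω), f_2(ω))
    "proof1": (2, 0),
    "proof2": (1, 1),
    "proof3": (0, 2),
}
lam = (0.5, -0.2)           # model parameters λ_i

def unnorm(feats):
    """exp(Σ_i λ_i f_i(ω)), the unnormalised weight of a proof."""
    return math.exp(sum(l * f for l, f in zip(lam, feats)))

Z = sum(unnorm(f) for f in proof_features.values())   # normalising constant
p = {w: unnorm(f) / Z for w, f in proof_features.items()}

print(p)  # proofs weighted by how often the positively-weighted feature fires
```

With λ_1 > 0 > λ_2, proofs using feature 1 more often receive more mass, which is the sense in which the λ_i parameterise the distribution over proofs.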

29 | A method for implementing a probabilistic model as a relational database
- Wong, Butz, et al.
- 1995
Citation Context: ...e with a few examples of particularly closely related work. This translation of the clique functions of a Markov net to a generalised relational database is essentially the same as that of Wong et al. [22]. Wong et al. translate many of the graphical operations used with Markov nets to database operations: product distributions are constructed using joins, conditional distributions by projection, and ma...

26 | Learning mixtures of DAG models
- Thiesson, Meek, et al.
- 1998
Citation Context: ...ng(B) ← , but the use of negation in SLPs has yet to be properly investigated, hence our current restriction to definite clauses. Mixture models for context-specific independence are investigated in (Thiesson et al., 1997), where learning of such models is considered. (One can view tables defining discrete distributions, as in Fig 4, as mixtures of degenerate distributions, but we will not do so.) mixlin(A,B,C) :- stro...

14 | Predicate invention and learning from positive examples only
- Boström
- 1998
Citation Context: ...to define the structural features of our distribution, it is natural to look to ILP for techniques which induce such structural features from data. Work in ILP on learning from positive examples only [15, 3] is of relevance here, but the most thorough incorporation of probabilistic approaches into ILP is by Dehaspe in [5]. Dehaspe presents the MACCENT algorithm which constructs a log-linear model using b...

13 | Towards probabilistic extensions of constraint-based grammars. Contribution to DYANA-2 Deliverable R1.2B, DYANA-2 project
- Eisele
- 1994
Citation Context: ... we concentrate on a special case of Riezler's framework, where the clauses used in a proof are the features defining the probability of that proof, with clause labels denoting the parameters. Eisele [6] examined this approach from an NLP perspective. Muggleton [14] introduced Stochastic Logic Programs, approaching the issue from a general logic programming perspective, with a view to applications in...

5 | Implementing randomised algorithms in constraint logic programming
- Angelopoulos, Di Pierro, et al.
- 1998
Citation Context: ...ch are likely to mimic algorithms for efficient inference in Koller et al.'s stochastic programs. Work on the implementation of randomised algorithms in logic programming is likely to be relevant too [2]. We also expect techniques from logic programming and computational linguistics, such as Earley deduction and program transformation, to be useful. For example, when learning the parameters of SLPs, R...

5 | An overview of some recent developments in Bayesian problem solving techniques
- Haddawy
- 1999
Citation Context: ...or manipulating tables. The current work seeks to extend that of Wong et al. by moving from a relational database setting to the logic programming setting. In Knowledge-based model construction (KBMC) [17, 10, 7] first-order rules with associated probabilities are used to generate Bayesian networks for particular queries. As in SLD-resolution, queries are matched to the heads of rules, but in KBMC this results...