## Dyna: Extending Datalog For Modern AI

Citations: | 4 - 0 self |

### BibTeX

@MISC{Eisner_dyna:extending,

author = {Jason Eisner and Nathaniel W. Filardo},

title = {Dyna: Extending Datalog For Modern AI},

year = {}

}

### Abstract

Modern statistical AI systems are quite large and complex; this interferes with research, development, and education. We point out that most of the computation involves database-like queries and updates on complex views of the data. Specifically, recursive queries look up and aggregate relevant or potentially relevant values. If the results of these queries are memoized for reuse, the memos may need to be updated through change propagation. We propose a declarative language, which generalizes Datalog, to support this work in a generic way. Through examples, we show that a broad spectrum of AI algorithms can be concisely captured by writing down systems of equations in our notation. Many strategies could be used to actually solve those systems. Our examples motivate certain extensions to Datalog, which are connected to functional and object-oriented programming paradigms.

1 Why a New Data-Oriented Language for AI?

Modern AI systems are frustratingly big, making them time-consuming to engineer
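As a rough illustration of the memoize-and-propagate pattern the abstract describes, the following Python sketch solves one recursive min-aggregation by agenda-based forward chaining. The rule it solves, the function names `solve`, `propagate`, and `lower_edge`, and the toy graph are all assumptions made for this example and are not taken from the paper. The sketch only handles updates that lower an edge weight.

```python
import heapq

# Hypothetical rule being solved (an assumption, written in Dyna-like notation):
#   dist(source)  min= 0.
#   dist(V)       min= dist(U) + edge(U, V).

def propagate(edges, chart, agenda):
    """Pop pending updates and push improvements to successors until quiescent."""
    while agenda:
        dist, node = heapq.heappop(agenda)
        if dist >= chart.get(node, float("inf")):
            continue  # stale update: the memo already holds a value at least as good
        chart[node] = dist  # update the memoized value of the item
        for succ, weight in edges.get(node, {}).items():
            heapq.heappush(agenda, (dist + weight, succ))
    return chart

def solve(edges, source):
    """Compute all memos from scratch, starting from the source item."""
    return propagate(edges, {}, [(0, source)])

def lower_edge(edges, chart, u, v, weight):
    """Change an extensional fact and propagate only the affected memos."""
    edges.setdefault(u, {})[v] = weight
    if u in chart:
        propagate(edges, chart, [(chart[u] + weight, v)])
    return chart

edges = {"a": {"b": 4, "c": 1}, "c": {"b": 2}, "b": {"d": 5}}
chart = solve(edges, "a")
lower_edge(edges, chart, "a", "b", 1)
```

With a priority queue as the agenda this behaves like Dijkstra's algorithm. Any other pop order would reach the same fixpoint on non-negative weights, which is one sense in which many strategies can solve the same system of equations.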

### Citations

7347 | Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference - Pearl - 1988 |

2854 | an Electronic Lexical Database - Fellbaum - 1998 |

2469 | Conditional random fields: Probabilistic models for segmenting and labeling sequence data - Lafferty, McCallum, et al. - 2001 |

2319 | Support Vector Networks - Cortes, Vapnik - 1995 |

Citation Context ...twork topology is typically a sparse graph (Figure 1). Sparse products are very common in AI. For example, sparse dot products are used both in computing similarity and in linear or log-linear models [15]. A dot product like score(Structure) += weight(Feature)*strength(Feature,Structure) 21 These names are not items but appear in the rule as unevaluated terms. However, the expressions X+I and Y+J are ...

1828 | R: A Language and Environment for Statistical Computing - R Development Core Team - 2006 |

1558 | A note on two problems in connexion with graphs - Dijkstra - 1959 |

1523 | The Stable Model Semantics for Logic Programming - Gelfond, Lifschitz - 1988 |

Citation Context ...ferent from usual practice in the logic programming community (see [54] for a review and synthesis), which when it permits non-stratified programs at all, typically identifies their semantics with one [29] or more [44] “stable models” or the intersection thereof [63,37], although in general the stable models are computationally intractable to find. A simple example of a non-stratified program (with at ...

1498 | WordNet: A Lexical Database for English - Miller - 1995 |

1166 | Chaff: Engineering an efficient SAT solver - Moskewicz, Madigan, et al. - 2001 |

1081 | Learning stochastic logic programs - Muggleton |

973 | Negation as failure - Clark - 1978 |

Citation Context ... this makes it Turing-complete, so we cannot guarantee that Dyna programs will terminate. That is the programmer’s responsibility. 6 This language design choice naturally extends completion semantics [12]. One can still force a default 0 by adding the explicit rule sibling(A,B) += 0 to (8). See the full version of this paper [22] for further discussion. 7 See the full version of this paper [22] for mo...

863 | The well-founded semantics for general logic programs - Gelder, Ross, et al. - 1991 |

Citation Context ...(see [54] for a review and synthesis), which when it permits non-stratified programs at all, typically identifies their semantics with one [29] or more [44] “stable models” or the intersection thereof [63,37], although in general the stable models are computationally intractable to find. A simple example of a non-stratified program (with at most one supported model [58]) is single-source shortest paths, 1...

814 | Adaptive mixture of local experts - Jacobs, Jordan, et al. - 1991 |

725 | The semantics of predicate logic as a programming language - Emden, Kowalski - 1976 |

Citation Context ...A model (or interpretation) of a logic program P is a partial map ⟦·⟧ from items to values. A supported model [4] is a fixpoint of the “immediate consequence” operator TP associated with that program [62]. In our setting, this means that for each item α, the value ⟦α⟧ (according to the model) equals the value that would be computed for α (given the program rules defining α from other items and the val...

721 | CYC: a large-scale investment in knowledge infrastructure - Lenat - 1995 |

655 | Synchronous Tree Adjoining Grammars - Shieber, Schabes - 1990 |

626 | Towards a theory of declarative knowledge - Apt, Blair, et al. - 1988 |

Citation Context ...;0.75). sibling(A,B;sum(Ma*Mb)) :- parent(C,A;Ma), parent(C,B;Mb). (7) Datalog dialects with aggregation (or negation) often impose a further requirement to ensure that the relations are well-defined [4,49]: – Stratification: A relation that is defined using aggregation (or negation) must not be defined in terms of itself. This prevents cyclic systems of equations that have no consistent solution (e.g.,...

601 | Markov logic network - Richardson, Domingos - 2006 |

Citation Context ...t and non-monotonic reasoning [6], via := rules like those in Figure 5. A related important use of default patterns in AI is “lifted inference” [61] in probabilistic settings like Markov Logic Networks [57], where additional (non-default) computation is necessary only for individuals about whom additional (non-default) facts are known. Yet another use in AI is default arcs of various kinds in determinis...

584 | Constraint Processing - Dechter |

Citation Context ...ar) |= possible(Var:Val). % Var has a possible value consistent &= non_empty(Var) whenever is_var(Var). % each Var in the system has a possible value Fig. 6: Arc consistency for constraint programming [19]. The goal is to rule out some impossible values for some variables, using a collection of unary constraints (in_domain) and binary constraints (compatible) that are given by the problem and/or tested ...

572 | Self: The power of simplicity - Ungar, Smith - 1987 |

568 | The Syntactic Process - Steedman - 2000 |

551 | A machine program for theorem-proving - Davis, Logemann, et al. - 1962 |

Citation Context ... number of grandchildren the child needs to probe. The recursion terminates when all variables are constrained. One good execution strategy for this Dyna program would resemble the actual DPLL method [18], with – a reasonable variable ordering strategy to select nextvar; – each child dynabase created by a temporary modification of the parent, which is subsequently undone; – running arc consistency at a node to c...

525 | Lexical-Functional Grammar: A Formal System for Grammatical Representation - Kaplan, Bresnan - 1982 |

446 | Labelme: A database and web-based tool for image annotation - Russell, Torralba, et al. - 2008 |

429 | A learning algorithm for continually running fully recurrent neural networks - Williams, Zipser - 1989 |

Citation Context ...ltaneous equations, often by iterating to convergence. In fact, the neural network program of Figure 2 already requires iteration to convergence in the case of a cyclic (“recurrent”) network topology [64]. Such iterative algorithms are often known as “message passing” algorithms. They can be regarded as negotiating a stable configuration of the items’ values. Updates to one item trigger updates to rel...

413 | Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling - Finkel, Grenager, et al. - 2005 |

Citation Context ...ses can be propagated forward through a pipeline (joint prediction) and gradients can be propagated backward (joint training). Although this is generally understood in the natural language processing community [28], it is surprisingly rare for papers to actually implement joint prediction or joint training, because of the extra design and engineering effort, particularly when integrating non-trivial modules by ...

404 | Hierarchical phrase-based translation - Chiang - 2007 |

385 | A hierarchical phrase-based model for statistical Machine Translation - Chiang - 2005 |

368 | Principles of data mining - Hand, Mannila, et al. - 2001 |

320 | Why and Where: A Characterization of Data Provenance - Buneman, Khanna, et al. - 2001 |

318 | Understanding belief propagation and its generalizations - Yedidia, Freeman, et al. |

Citation Context ...n one or more assignments Asst in which this is the case. message(Con, Var:Val) += belief(Con:Asst) / message(Var:Val, Con) whenever Asst.Var == Val. Fig. 7: Loopy belief propagation on a factor graph [66]. The constraints together define a Markov Random Field joint probability distribution over the variables. We seek to approximate the marginals of that distribution: at each variable Var we will deduc...

314 | Logic Programming and Databases - Ceri, Gottlob, et al. - 1990 |

Citation Context ... and extensional data. This is the focus of §2, beginning with a review of ordinary Datalog in §2.1. A program in our Dyna language specifies what we call a dynabase. Recall that a deductive database [11,56] contains not only extensional relations but also rules (usually Datalog rules or some other variant on Horn clauses) that define additional intensional relations, similar to views. Our term “dynabase...

301 | Probabilistic Horn abduction and Bayesian networks - Poole - 1993 |

298 | Europarl: A Parallel Corpus for Statistical Machine Translation - Koehn |

296 | The Penn treebank: Annotating predicate argument structure - Marcus, Kim, et al. - 1994 |

287 | Maintenance of Materialized Views: Problems, Techniques, and Applications - Gupta, Mumick - 1995 |

Citation Context ...n AI Recall that dynabases implement dynamic algorithms: their intensional items update automatically in response to changes in their extensional input. This corresponds to “view maintenance” in databases [34], and to “self-adjusting computation” [1] in functional languages. 41 Additional details may be found in a section of the full version of this paper [22]. We observe that this kind of change propagation ...

277 | The essence of compiling with continuations - Flanagan, Sabry, et al. - 1993 |

251 | Stable models and an alternative logic programming paradigm. The Journal of Logic Programming - Marek, Truszczyński - 1999 |

Citation Context ...sual practice in the logic programming community (see [54] for a review and synthesis), which when it permits non-stratified programs at all, typically identifies their semantics with one [29] or more [44] “stable models” or the intersection thereof [63,37], although in general the stable models are computationally intractable to find. A simple example of a non-stratified program (with at most one supp...

239 | EuroWordNet: A Multilingual Database with Lexical Semantic Networks - Vossen - 1998 |

222 | OLD resolution with tabulation - Tamaki, Sato - 1986 |

212 | Functional reactive animation - Elliott, Hudak - 1997 |

Citation Context ...ss intelligence (e.g., LogicBlox [41]); stream processing for algorithmic equities trading (e.g., DBToaster [2]); user interfaces (e.g., Dynasty [24] and Fruit [16]); declarative animation (e.g., Fran [25]); query planners and optimizers (see the discussion in the full paper); and even (incremental) compilers [9]. In an AI system—for example, medical decision support—sensors may continuously gather infor...

202 | DBpedia – A crystallization point for the Web of Data - Bizer, Lehmann, et al. - 2009 |

187 | An overview of krl: A knowledge representation language - Bobrow, Winograd |

176 | Building Large Knowledge-Based Systems: Representation and Inference - Lenat, Guha - 1990 |

169 | Principles and implementation of deductive parsing - Shieber, Schabes, et al. - 1995 |

Citation Context ... express how a parse tree is recursively built up by combining adjacent phrases into larger phrases, under the guidance of a grammar. The forward-chaining algorithm of §2.6 here yields “agenda-based parsing” [60]: when a recently built or updated phrase pops off the agenda into the chart, it tries to combine with adjacent phrases in the chart. We will return to this example in §3.2. Meanwhile, the reader is e...

162 | Recognition and parsing of context-free languages in time n^3. Information and Control, 10(2):189–208 - Younger - 1967 |

Citation Context ...ediate values: fib(N) := fib(N-1) + fib(N-2). % general rule fib(0) := 1. % exceptions for base cases fib(1) := 1. (18) As a basic AI example, consider context-free parsing with a CKY-style algorithm [67]. The Dyna program in Figure 8 consists of 3 rules that directly % Belief at each variable based on the messages it receives from constraints. belief(Var:Val) *= message(Con, Var:Val). % Belief at eac...

150 | Products of experts - Hinton - 1999 |

Citation Context ...requires the ability for components to query one another or pass messages to one another [28]. Similarly, one may wish to combine the strengths of diverse AI systems that are attempting the same task [35]. A recently emerging theme, therefore, is the development of principled methods for coordinating the work of multiple combinatorial algorithms. See references in the full version of this paper [22]. ...

146 | Incremental parsing with the perceptron algorithm - Collins, Roark - 2004 |

142 | What you Always Wanted to Know About Datalog (And Never Dared to Ask) - Ceri, Gottlob, et al. - 1989 |

Citation Context ...xing certain restrictions; and by introducing useful notions of encapsulation and inheritance. (Formal semantics are outlined in an appendix to the full version [22].) 2.1 Background: Datalog Datalog [10] is a language—a concrete syntax—for defining named, flat relations. The (slightly incorrect) statement “Two people are siblings if they share a parent” can be precisely captured by a rule such as sib...

137 | Provenance semirings - Green, Karvounarakis, et al. - 2007 |

Citation Context ...nction (over the subgoals of a proof), and the aggregator :- denotes boolean disjunction (over possible proofs). Thus, true and null effectively form a 2-valued logic. Semiring-weighted Datalog programs [30,23,31] correspond to rules like (8) where + and * denote the operations of a semiring. 2.4 Restoring Expressivity Although our motivation comes from deductive databases, Dyna relaxes the restrictions that Datalog...
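The semiring reading mentioned in the last citation context can be sketched in Python. The sketch assumes that rule (8), which is not quoted in full on this page, has the form `sibling(A,B) += parent(C,A) * parent(C,B)`. The function name `sibling`, the dictionary encoding of the `parent` relation, and the sample facts are hypothetical and chosen only for illustration.

```python
import operator

def sibling(parent, plus, times):
    """Evaluate the assumed rule sibling(A,B) += parent(C,A) * parent(C,B).

    `parent` maps (C, A) pairs to values; `plus` and `times` are the
    semiring operations. Items that receive no contribution stay absent
    (null) rather than defaulting to a zero element.
    """
    out = {}
    for (c1, a), ma in parent.items():
        for (c2, b), mb in parent.items():
            if c1 != c2:
                continue  # the two subgoals must share the same parent C
            term = times(ma, mb)
            key = (a, b)
            out[key] = plus(out[key], term) if key in out else term
    return out

# Hypothetical weighted facts: parent(C, A) holds with measure M.
weighted = {("p", "x"): 0.5, ("p", "y"): 0.75, ("q", "x"): 0.5, ("q", "y"): 0.25}
real = sibling(weighted, operator.add, operator.mul)

# The same rule under the boolean semiring behaves like plain Datalog.
facts = {key: True for key in weighted}
boolean = sibling(facts, operator.or_, operator.and_)
```

Passing the operations as arguments keeps the rule body fixed while the interpretation changes, which is the point of the semiring view. Under this encoding `real[("x", "y")]` sums one product per shared parent.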