## Statistical predicate invention (2007)

### Other Repositories/Bibliography

Venue: | In Z. Ghahramani (Ed.), Proceedings of the 24th Annual International Conference on Machine Learning (ICML-2007) |

Citations: | 35 - 10 self |

### BibTeX

@INPROCEEDINGS{Kok07statisticalpredicate,

author = {Stanley Kok and Pedro Domingos},

title = {Statistical predicate invention},

editor = {Z. Ghahramani},

booktitle = {Proceedings of the 24th Annual International Conference on Machine Learning (ICML-2007)},

year = {2007},

pages = {433--440}

}

### Abstract

We propose statistical predicate invention (SPI) as a key problem for statistical relational learning. SPI is the problem of discovering new concepts, properties and relations in structured data, and generalizes hidden variable discovery in statistical models and predicate invention in inductive logic programming (ILP). We propose an initial model for SPI based on second-order Markov logic, in which predicates as well as arguments can be variables, and the domain of discourse is not fully known in advance. Our approach iteratively refines clusters of symbols based on the clusters of symbols they appear in atoms with (e.g., it clusters relations by the clusters of the objects they relate). Since different clusterings are better for predicting different subsets of the atoms, we allow multiple cross-cutting clusterings. We show that this approach outperforms Markov logic structure learning and the recently introduced infinite relational model on a number of relational datasets.
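The refinement idea in the abstract (clustering relations by the clusters of the objects they relate, and objects by the clusters of the relations they appear in) can be illustrated with a small sketch. This is not the paper's model: it swaps the probabilistic scoring under second-order Markov logic for exact signature matching, finds a single clustering rather than multiple cross-cutting ones, and handles only binary atoms. The function names (`blocks`, `relabel`, `refine_clusters`) and most symbols in `ATOMS` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Ground atoms as (relation, arg1, arg2). Treats, Diagnoses and Antibiotic echo
# the UMLS examples quoted on this page; all other symbols are made up.
ATOMS = [
    ("Treats", "Antibiotic", "Infection"),
    ("Treats", "Antiviral", "Flu"),
    ("Cures", "Antibiotic", "Infection"),
    ("Cures", "Antiviral", "Flu"),
    ("Diagnoses", "BloodTest", "Infection"),
    ("Diagnoses", "XRay", "Flu"),
    ("InteractsWith", "Antibiotic", "Antiviral"),
    ("InteractsWith", "Antiviral", "Antibiotic"),
]


def blocks(labels):
    """Turn a symbol -> label map into a set of clusters (frozensets)."""
    groups = defaultdict(set)
    for symbol, label in labels.items():
        groups[label].add(symbol)
    return {frozenset(group) for group in groups.values()}


def relabel(keys):
    """Replace arbitrary hashable keys with small integer cluster ids."""
    ids = {}
    return {symbol: ids.setdefault(key, len(ids)) for symbol, key in keys.items()}


def refine_clusters(atoms):
    """Split clusters until symbols in a cluster co-occur with the same clusters.

    Assumption: exact signature equality stands in for a probabilistic score.
    """
    rel_label = {r: 0 for r, _, _ in atoms}
    obj_label = {x: 0 for _, a, b in atoms for x in (a, b)}
    while True:
        rel_sig, obj_sig = defaultdict(set), defaultdict(set)
        for r, a, b in atoms:
            # A relation is described by the object-cluster pairs it relates.
            rel_sig[r].add((obj_label[a], obj_label[b]))
            # An object is described by (relation cluster, position, other cluster).
            obj_sig[a].add((rel_label[r], 0, obj_label[b]))
            obj_sig[b].add((rel_label[r], 1, obj_label[a]))
        # Keeping the old label in the key means clusters can only split.
        new_rel = relabel({r: (lab, frozenset(rel_sig[r])) for r, lab in rel_label.items()})
        new_obj = relabel({o: (lab, frozenset(obj_sig[o])) for o, lab in obj_label.items()})
        if blocks(new_rel) == blocks(rel_label) and blocks(new_obj) == blocks(obj_label):
            return blocks(obj_label), blocks(rel_label)
        rel_label, obj_label = new_rel, new_obj


object_clusters, relation_clusters = refine_clusters(ATOMS)
print(object_clusters)
print(relation_clusters)
```

On this toy input, `Treats` and `Cures` end up in one cluster because both relate the drug cluster to the disease cluster, while `Diagnoses` and `InteractsWith` each stay separate. Exact matching is brittle on noisy data, which is one reason a statistical formulation is attractive.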

### Citations

7054 |
Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference
- Pearl
- 1988
Citation Context ...tension of first-order logic. A Markov logic network (MLN) is a set of weighted first-order formulas. Together with a set of constants representing objects in the domain, it defines a Markov network (Pearl, 1988) with one node per ground atom and one feature per ground formula. The weight of a feature is the weight of the first-order formula that originated it. The probability distribution over possible worl... |

565 | Markov logic networks
- Richardson, Domingos
- 2006
Citation Context ...n be of more than one type, and the relations can take on any number of arguments. Xu et al. (2006) propose a closely related model. In this paper, we present MRC, an algorithm based on Markov logic (Richardson & Domingos, 2006), as a first step towards a general framework for SPI. MRC 1 To our knowledge, this is the only previous paper that uses the term ‘statistical predicate invention’. Statistical Predicate Invention au... |

251 | Information-theoretic co-clustering - Dhillon, Mallela, et al. |

236 | Improving text classification by shrinkage in a hierarchy of classes
- McCallum, Rosenfeld, et al.
- 1998
Citation Context ...e finest clusterings produced. In other words, if we view MRC as growing a tree of clusterings, it returns the leaves. Conceivably, it might be useful to retain the whole tree, and perform shrinkage (McCallum et al., 1998) over it. This is an item for future work. Notice that the clusters created at a higher level of recursion constrain the clusters that can be created at lower levels, e.g., if two symbols are assigne... |

211 |
Tutorial on statistical relational learning
- Getoor
Citation Context ...f statistical learning and relational learning (also known as inductive logic programming), and developed several novel representations, as well as algorithms to learn their parameters and structure (Getoor & Taskar, 2007). However, the problem of statistical predicate invention (SPI) has so far received little attention in the community. SPI is the discovery of new concepts, properties and relations from data, expres... |

179 |
Machine invention of first-order predicates by inverting resolution
- Muggleton, Buntine
- 1988
Citation Context ...n be invented by analyzing first-order formulas, and forming a predicate to represent either their commonalities (interconstruction (Wogulis & Langley, 1989)) or their differences (intraconstruction (Muggleton & Buntine, 1988)). A weakness of inter/intraconstruction is that they are prone to over-generating predicates, many of which are not useful. Predicates can also be invented by instantiating second-order templates (S... |

137 | Learning systems of concepts with an infinite relational model
- Kemp, Tenenbaum, et al.
- 2006
Citation Context ... symbols of different arities and argument types are never clustered together. This is a limitation that we plan to overcome in the future. 4. Experiments In our experiments, we compare MRC with IRM (Kemp et al., 2006) and MLN structure learning (Kok & Domingos, 2005). 4.1. Infinite Relational Model The IRM is a recently-published model that also clusters objects, attributes, and relations. However, unlike MRC, it... |

121 | Combinatorial stochastic processes - Pitman - 2006 |

102 |
The Alchemy system for statistical relational AI
- Kok, Sumner, et al.
- 2008
Citation Context ...ts CRP and Beta parameters. 4.2. MLN Structure Learning We also compare MRC to Kok and Domingos’ (2005) MLN structure learning algorithm (MSL, beam search version) implemented in the Alchemy package (Kok et al., 2006). MSL begins by creating all possible unit clauses. Then, at each step, it creates candidate clauses by adding literals to the current clauses. The weight of each candidate clause is learned by optim... |

95 |
Sound and efficient inference with probabilistic and deterministic dependencies
- Poon, Domingos
Citation Context ...g MSL. Since in each run these are only 10% of the training set, setting them to false does not greatly change the sufficient statistics (true clause counts) learning is based on. We then ran MC-SAT (Poon & Domingos, 2006) on the MLNs learned by MSL to infer the probabilities of the test atoms. 4 To evaluate the performance of MRC, IRM and MSL, we measured the average conditional log-likelihood of the test atoms given... |

87 | Learning the structure of Markov logic networks
- Kok, Domingos
Citation Context ...s are never clustered together. This is a limitation that we plan to overcome in the future. 4. Experiments In our experiments, we compare MRC with IRM (Kemp et al., 2006) and MLN structure learning (Kok & Domingos, 2005). 4.1. Infinite Relational Model The IRM is a recently-published model that also clusters objects, attributes, and relations. However, unlike MRC, it only finds a single clustering. It defines a gene... |

68 | Relational learning with statistical predicate invention: Better models for hypertext - Craven, Slattery |

58 | Leveraging relational autocorrelation with latent group models
- Neville, Jensen
- 2005
Citation Context ...ndle noisy data. Only a few approaches to date combine elements of statistical and relational learning. Most of them only cluster objects, not relations (Popescul & Ungar, 2004; Wolfe & Jensen, 2004; Neville & Jensen, 2005; Xu et al., 2005; Long et al., 2006; Roy et al., 2006). Craven and Slattery (2001) proposed a learning mechanism for hypertext domains in which class predictions produced by naive Bayes are added to ... |

55 | Relational clichés: Constraining constructive induction during relational learning
- Silverstein, Pazzani
- 1991
Citation Context ...8)). A weakness of inter/intraconstruction is that they are prone to over-generating predicates, many of which are not useful. Predicates can also be invented by instantiating second-order templates (Silverstein & Pazzani, 1991), or to represent exceptions to learned rules (Srinivasan et al., 1992). Relational predicate invention approaches suffer from a limited ability to handle noisy data. Only a few approaches to date co... |

46 | Infinite Hidden Relational Models - Xu, Tresp, et al. - 2006 |

43 | Spectral clustering for multi-type relational data
- Long, Zhang, et al.
- 2006
Citation Context ... date combine elements of statistical and relational learning. Most of them only cluster objects, not relations (Popescul & Ungar, 2004; Wolfe & Jensen, 2004; Neville & Jensen, 2005; Xu et al., 2005; Long et al., 2006; Roy et al., 2006). Craven and Slattery (2001) proposed a learning mechanism for hypertext domains in which class predictions produced by naive Bayes are added to an ILP system (FOIL) as invented pre... |

40 | Discovering hidden variables: A structure-based approach - Elidan, Lotner, et al. - 2000 |

30 |
Default Probability
- Osherson, Stern, et al.
- 1991
Citation Context ...ne of the candidates improves WPLL. 4.3. Datasets We compared MRC to IRM and MSL on all four datasets used in Kemp et al. (2006). 3 Animals. This dataset contains a set of animals and their features (Osherson et al., 1991). It consists exclusively of unary predicates of the form f(a), where f is a feature and a is an animal (e.g., Swims(Dolphin)). There are 50 animals, 85 features, and thus a total of 4250 ground atom... |

24 | Cluster-Based Concept Invention for Statistical Relational Learning
- Popescul, Ungar
- 2004
Citation Context ...approaches suffer from a limited ability to handle noisy data. Only a few approaches to date combine elements of statistical and relational learning. Most of them only cluster objects, not relations (Popescul & Ungar, 2004; Wolfe & Jensen, 2004; Neville & Jensen, 2005; Xu et al., 2005; Long et al., 2006; Roy et al., 2006). Craven and Slattery (2001) proposed a learning mechanism for hypertext domains in which class pre... |

24 | Distinguishing exceptions from noise in non-monotonic learning
- Srinivasan, Muggleton, et al.
- 1992
Citation Context ...enerating predicates, many of which are not useful. Predicates can also be invented by instantiating second-order templates (Silverstein & Pazzani, 1991), or to represent exceptions to learned rules (Srinivasan et al., 1992). Relational predicate invention approaches suffer from a limited ability to handle noisy data. Only a few approaches to date combine elements of statistical and relational learning. Most of them onl... |

23 | Learning hidden variable networks: The information bottleneck approach - Elidan, Friedman - 2005 |

23 | An upper level ontology for the biomedical domain
- McCray
Citation Context ...o-clustering, and has received considerable attention in the recent literature (e.g., Dhillon et al. (2003)). UMLS. UMLS contains data from the Unified Medical Language System, a biomedical ontology (McCray, 2003). It consists of binary predicates of the form r(c, c′), where c and c′ are biomedical concepts (e.g., Antibiotic, Disease), and r is a relation between them (e.g., Treats, Diagnoses). There are 4... |

21 | Learning annotated hierarchies from relational data
- Roy, Kemp, et al.
- 2007
Citation Context ...nts of statistical and relational learning. Most of them only cluster objects, not relations (Popescul & Ungar, 2004; Wolfe & Jensen, 2004; Neville & Jensen, 2005; Xu et al., 2005; Long et al., 2006; Roy et al., 2006). Craven and Slattery (2001) proposed a learning mechanism for hypertext domains in which class predictions produced by naive Bayes are added to an ILP system (FOIL) as invented predicates. 1 The SAY... |

14 | Change of representation for statistical relational learning
- Davis, Ong, et al.
- 2007
Citation Context ...attery (2001) proposed a learning mechanism for hypertext domains in which class predictions produced by naive Bayes are added to an ILP system (FOIL) as invented predicates. 1 The SAYU-VISTA system (Davis et al., 2007) uses an off-the-shelf ILP system (Aleph) to learn Horn clauses on a database. It creates a predicate for each clause learned, adds it as a relational table to the database, and then runs a standard ... |

13 |
Improving efficiency by learning intermediate concepts
- Wogulis, Langley
- 1989
Citation Context ...g employs several techniques for predicate invention. Predicates can be invented by analyzing first-order formulas, and forming a predicate to represent either their commonalities (interconstruction (Wogulis & Langley, 1989)) or their differences (intraconstruction (Muggleton & Buntine, 1988)). A weakness of inter/intraconstruction is that they are prone to over-generating predicates, many of which are not useful. Predi... |

12 |
Playing multiple roles: Discovering overlapping roles in social networks
- Wolfe, Jensen
- 2004
Citation Context ... limited ability to handle noisy data. Only a few approaches to date combine elements of statistical and relational learning. Most of them only cluster objects, not relations (Popescul & Ungar, 2004; Wolfe & Jensen, 2004; Neville & Jensen, 2005; Xu et al., 2005; Long et al., 2006; Roy et al., 2006). Craven and Slattery (2001) proposed a learning mechanism for hypertext domains in which class predictions produced by n... |

10 |
The detection of patterns in Alyawarra nonverbal behavior
- Denham
- 1973
Citation Context ...ations and 135 concepts, for a total of 893,025 ground atoms, of which 6529 are true. Kinship. This dataset contains kinship relationships among members of the Alyawarra tribe from Central Australia (Denham, 1973). Predicates are of the form k(p, p′), where k is a kinship relation and p, p′ are persons. There are 26 kinship terms and 104 persons, for a total of 281,216 ground atoms, of which 10,686 are tru... |

6 |
Dimensionality of nations project: Attributes of nations and behavior of nation dyads
- Rummel
- 1950
Citation Context ...persons. There are 26 kinship terms and 104 persons, for a total of 281,216 ground atoms, of which 10,686 are true. Nations. This dataset contains a set of relations among nations and their features (Rummel, 1999). It consists of binary and unary predicates. The binary predicates are of the form r(n, n′), where n, n′ are nations, and r is a relation between them (e.g., ExportsTo, GivesEconomicAidTo). The u... |

2 | Predicate invention: A comprehensive view (Technical Report - Kramer - 1995 |

1 |
Statistical Predicate Invention
- Kemp, Tenenbaum, et al.
- 2006
Citation Context ... symbols of different arities and argument types are never clustered together. This is a limitation that we plan to overcome in the future. 4. Experiments In our experiments, we compare MRC with IRM (Kemp et al., 2006) and MLN structure learning (Kok & Domingos, 2005). 4.1. Infinite Relational Model The IRM is a recently-published model that also clusters objects, attributes, and relations. However, unlike MRC, it... |
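The ground-atom totals quoted in the Animals and Kinship citation contexts above follow from the predicate arity: every predicate symbol is grounded over every tuple of domain objects. A small check of that arithmetic, with `ground_atom_count` as a hypothetical helper name, might look like the following. The UMLS total is left out because its relation count is truncated in the excerpt.

```python
def ground_atom_count(num_predicates: int, domain_size: int, arity: int) -> int:
    """Number of ground atoms when each predicate takes `arity` arguments
    drawn from a single domain of `domain_size` objects."""
    return num_predicates * domain_size ** arity


# Animals: 85 unary features over 50 animals.
animals = ground_atom_count(num_predicates=85, domain_size=50, arity=1)

# Kinship: 26 binary kinship terms over 104 persons.
kinship = ground_atom_count(num_predicates=26, domain_size=104, arity=2)

print(animals, kinship)
```

The quadratic growth for binary predicates explains why the Kinship dataset has far more ground atoms than Animals despite having fewer predicate symbols.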