## Learning relational probability trees (2003)

### Download Links

- [www.cc.gatech.edu]
- [kdl.cs.umass.edu]
- [www.cs.purdue.edu]
- [www.cs.umass.edu]
- [www.cs.cornell.edu]
- DBLP

### Other Repositories/Bibliography

Venue: Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2003)

Citations: 119 (33 self)

### BibTeX

@inproceedings{Neville03learningrelational,
  author    = {Jennifer Neville and David Jensen and Lisa Friedland and Michael Hay},
  title     = {Learning relational probability trees},
  booktitle = {Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
  year      = {2003},
  pages     = {625--630}
}


### Abstract

Classification trees are widely used in the machine learning and data mining communities for modeling propositional data. Recent work has extended this basic paradigm to probability estimation trees. Traditional tree learning algorithms assume that instances in the training data are homogeneous and independently distributed. Relational probability trees (RPTs) extend standard probability estimation trees to a relational setting in which data instances are heterogeneous and interdependent. Our algorithm for learning the structure and parameters of an RPT searches over a space of relational features that use aggregation functions (e.g. AVERAGE, MODE, COUNT) to dynamically propositionalize relational data and create binary splits within the RPT. Previous work has identified a number of statistical biases due to characteristics of relational data such as autocorrelation and degree disparity. The RPT algorithm uses a novel form of randomization test to adjust for these biases. On a variety of relational learning tasks, RPTs built using randomization tests are significantly smaller than other models and achieve equivalent, or better, performance.
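The aggregation-based splits the abstract describes can be sketched concretely. This is a minimal illustration with made-up data and field names, not the authors' implementation: each heterogeneous instance's linked values are collapsed by an aggregation function into a single value, which then feeds a binary split.

```python
from statistics import mean, mode

# Hypothetical relational instance: a web page with links to other pages
# whose categories and hit counts are known (field names are illustrative).
page = {
    "linked_categories": ["student", "student", "course"],
    "linked_hits": [120, 45, 300],
}

# The aggregation functions named in the abstract, each collapsing a
# multiset of linked values into one propositional feature value.
AGGREGATORS = {
    "AVERAGE": mean,
    "MODE": mode,
    "COUNT": len,
}

def feature_value(instance, field, fn):
    """Dynamically propositionalize: aggregate the linked values of one field."""
    return AGGREGATORS[fn](instance[field])

def binary_split(instance, field, fn, threshold_test):
    """An RPT-style binary split over an aggregated relational feature."""
    return threshold_test(feature_value(instance, field, fn))

print(feature_value(page, "linked_categories", "MODE"))   # most common linked category
print(binary_split(page, "linked_hits", "AVERAGE", lambda v: v > 100))
```

The search the abstract mentions would score many such (field, aggregator, threshold) combinations and keep the best-scoring split at each node.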

### Citations

5438 | C4.5: Programs for Machine Learning
- Quinlan
- 1993
Citation Context: ...sis task and present an abbreviated version of an RPT model learned for this task. We outline the details of the RPT algorithm and finish with an experimental section that evaluates RPTs against C4.5 [15] and relational Bayes classifiers [13]. 2. EXAMPLE TASK Recent research has examined methods for constructing statistical models of complex relational data [4]. Examples of such data include social ne...

533 | Learning probabilistic relational models
- Friedman, Getoor, et al.
- 1999
Citation Context: ...ference. Aggregation is widely used as a means to “propositionalize” relational data for modeling, applied either as a pre-processing step (e.g. [11]) or dynamically during the learning process (e.g. [5]). Heterogeneous data instances are transformed into homogeneous records by aggregating multiple values into a single value (e.g. average actor age). Conventional machine learning techniques are then e...

362 | Learning to Extract Symbolic Knowledge from the World Wide Web
- Craven, DiPasquo, et al.
- 1998
Citation Context: ...4]. Examples of such data include social networks, genomic data, and data on interrelated people, places, things, and events extracted from text documents. The data set collected by the WebKB Project [2] consists of a set of web pages from four computer science departments. The web pages have been manually classified into categories: course, faculty, staff, student, research project, or other. The ca...

124 | Randomization Tests
- Edgington
- 1980
Citation Context: ...ndomization tests to account for bias and variance in feature scores due to linkage, autocorrelation and degree disparity. A randomization test is a type of computationally intensive statistical test [3], which involves generating many replicates of an actual data set—typically called pseudosamples—and using them to estimate a sampling distribution. Pseudosamples are generated by randomly reordering (permuting)...
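The randomization test described in this context can be illustrated with a generic permutation test. This is a minimal sketch with a made-up feature score, not the paper's RPT-specific procedure: the score of a candidate feature on the actual data is compared against its scores on many pseudosamples whose class labels have been permuted.

```python
import random

def score(feature, labels):
    """Toy feature score: absolute difference in positive-label rate
    between instances where the binary feature is 1 versus 0."""
    pos = [l for f, l in zip(feature, labels) if f]
    neg = [l for f, l in zip(feature, labels) if not f]
    if not pos or not neg:
        return 0.0
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))

def randomization_test(feature, labels, n_pseudosamples=1000, seed=0):
    """Estimate a p-value for the observed score: the fraction of permuted
    pseudosamples whose score is at least as extreme as the actual data's."""
    rng = random.Random(seed)
    observed = score(feature, labels)
    hits = 0
    for _ in range(n_pseudosamples):
        permuted = labels[:]
        rng.shuffle(permuted)  # permuting labels destroys any real association
        if score(feature, permuted) >= observed:
            hits += 1
    return hits / n_pseudosamples

# A feature perfectly aligned with the labels should get a small p-value.
feature = [1, 1, 1, 1, 0, 0, 0, 0]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
print(randomization_test(feature, labels))
```

The returned fraction estimates how likely the observed score is under the null hypothesis of no feature-label association; the RPT algorithm adapts this idea to correct scores distorted by autocorrelation and degree disparity.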

96 | Linkage and autocorrelation cause feature selection bias in relational learning
- Jensen, Neville
- 2002
Citation Context: ...stics of relational data. Our recent work has concentrated on the challenges of learning probabilistic models in relational data, where the traditional assumption of instance independence is violated [7, 8]. We have identified three characteristics of relational data—concentrated linkage, degree disparity, and relational autocorrelation—and have shown how they can complicate efforts to construct good st...

85 | Top-down induction of first-order logical decision trees
- Blockeel, Raedt
- 1998
Citation Context: ...pairs and aggregation functions, but the basic algorithm structure remains relatively unchanged. Several other decision tree algorithms for relational data have already been developed including TILDE [1], Multi Relational Decision Trees [9] and Structural Regression Trees (SRTs) [10]. These systems focus on extending decision tree algorithms to work in the first-order logic framework used by inductiv...

83 | A machine learning approach to building domainspecific search engines
- McCallum, Nigam, et al.
- 1999
Citation Context: ... year of a producer's first film. The third task used a data set drawn from Cora, a database of computer science research papers extracted automatically from the web using machine learning techniques [12]. We selected the set of 1,511 machine-learning papers that had at least one author, reference and journal. In addition to these papers, the collection contains all associated authors, references, and...

82 | Multiple comparisons in induction algorithms
- Jensen, Cohen
- 1999
Citation Context: ...as resulted in a large body of research detailing the results of various algorithm design choices. For example, it has been shown that cross-validation can be used to avoid attribute selection biases [6] and that split criteria are generally insensitive to misclassification costs [14]. Recent work has extended the basic classification tree paradigm to probability estimation trees and has focused on i...

64 | Structural Regression Trees
- Kramer
- 1996
Citation Context: ...ively unchanged. Several other decision tree algorithms for relational data have already been developed including TILDE [1], Multi Relational Decision Trees [9] and Structural Regression Trees (SRTs) [10]. These systems focus on extending decision tree algorithms to work in the first-order logic framework used by inductive logic programming systems (ILP). Although these systems can be used to build cl...

64 | Simple estimators for relational Bayesian classifiers
- Neville, Jensen, et al.
- 2003
Citation Context: ...rsion of an RPT model learned for this task. We outline the details of the RPT algorithm and finish with an experimental section that evaluates RPTs against C4.5 [15] and relational Bayes classifiers [13]. 2. EXAMPLE TASK Recent research has examined methods for constructing statistical models of complex relational data [4]. Examples of such data include social networks, genomic data, and data on inte...

41 | Well-Trained PETs: Improving Probability Estimation Trees. CeDER Working Paper #IS-00-04
- Provost, Domingos
- 2000
Citation Context: ...m design choices. For example, it has been shown that cross-validation can be used to avoid attribute selection biases [6] and that split criteria are generally insensitive to misclassification costs [14]. Recent work has extended the basic classification tree paradigm to probability estimation trees and has focused on improving probability estimates in leaves [14]. We can...

25 | Avoiding bias when aggregating relational data with degree disparity
- Jensen, Neville, et al.
- 2003
Citation Context: ...stics of relational data. Our recent work has concentrated on the challenges of learning probabilistic models in relational data, where the traditional assumption of instance independence is violated [7, 8]. We have identified three characteristics of relational data—concentrated linkage, degree disparity, and relational autocorrelation—and have shown how they can complicate efforts to construct good st...

22 | Multi-relational Decision Tree Induction
- Knobbe, Siebes, et al.
- 1999
Citation Context: ...the basic algorithm structure remains relatively unchanged. Several other decision tree algorithms for relational data have already been developed including TILDE [1], Multi Relational Decision Trees [9] and Structural Regression Trees (SRTs) [10]. These systems focus on extending decision tree algorithms to work in the first-order logic framework used by inductive logic programming systems (ILP). Al...

20 | Stochastic Propositionalization of Non-Determinate Background Knowledge
- Kramer, Pfahringer, Helma
- 1998
Citation Context: ..., heterogeneous data instances for both learning and inference. Aggregation is widely used as a means to “propositionalize” relational data for modeling, applied either as a pre-processing step (e.g. [11]) or dynamically during the learning process (e.g. [5]). Heterogeneous data instances are transformed into homogeneous records by aggregating multiple values into a single value (e.g. average actor age...