## Learning Statistical Models from Relational Data (2001)

Citations: 41 (6 self)

### BibTeX

@TECHREPORT{Getoor01learningstatistical,

author = {Lise Getoor and Luc De Raedt},

title = {Learning Statistical Models from Relational Data},

institution = {},

year = {2001}

}

### Abstract

This workshop is the second in a series of workshops held in conjunction with AAAI and IJCAI. The first workshop was held in July, 2000 at AAAI. Notes from that workshop are available at

### Citations

5180 | C4.5: Programs for Machine Learning
- Quinlan
- 1993
Citation Context ...ked movie data [15]. Furthermore, homophily has been observed in human groups with respect to a wide variety of descriptive variables, and is one of the basic premises of theories of social structure =-=[17]-=-. Chakrabarti et al. take advantage of autocorrelation in class values to classify hypertext documents [18]. Their procedure learns a probabilistic model based on the classes of related entities, and ... |

465 | Inductive logic programming: Theory and methods
- Muggleton, De Raedt
- 1994
Citation Context ...ds to excel when (as in Figure 1) entities are more likely to be linked to other entities with the same class membership. This intuitive notion is captured more formally by relational autocorrelation =-=[15]-=-: the correlation between values of the same attribute on linked entities “represents an extremely important type of knowledge about relational data, one that is just beginning to be explored and expl... |

453 | The use of the area under the ROC curve in the evaluation of machine learning algorithms
- Bradley
- 1997
Citation Context ... For example, one may want to decay the impact of joined edges in the relationship vector as the distance from the node in the graph increases. The weight of an edge in E is therefore defined by [3]. =-=[2]-=- $w(E) = \alpha\, w(e_1) \oplus \beta\, w(e_2)$ [3] Definition: An entity that takes into consideration relational links of distance greater than one may therefore be defined as the recursive sum of each entity with its fea... |

218 | FOIL: A midterm report
- Quinlan, Cameron-Jones
- 1993
Citation Context ...y of descriptive variables, and is one of the basic premises of theories of social structure [17]. Chakrabarti et al. take advantage of autocorrelation in class values to classify hypertext documents =-=[18]-=-. Their procedure learns a probabilistic model based on the classes of related entities, and therefore can capture more complex relationships than simply homophily. There are several ways in which the... |

195 | Probabilistic frame-based systems
- Koller, Pfeffer
- 1998
Citation Context ...itive rate, on the x-axis). The area under the ROC curve (AUC), equivalent to the Wilcoxon-Mann-Whitney statistic, is the probability that a member of the class will be scored higher than a non-member =-=[9]-=-. Error is calculated as 1 – AUC, and since the AUCs often are close to 1, relative error reduction is reported for comparisons. Figure 2 shows the ROC curves for the best method, the weighted, enti... |

95 | Linkage and autocorrelation cause feature selection bias in relational learning
- Jensen, Neville
- 2002
Citation Context ...r. Probabilistic models of relational structure. In Proc. ICML01, Williamstown, MA., 2001. [5] J. M. Kleinberg. Authoritative sources in a hyperlinked environment. J. of the ACM, 46(5):604–632, 1999. =-=[6]-=- D. Koller and A. Pfeffer. Probabilistic frame-based systems. In Proc. AAAI98, pages 580–587, Madison, Wisc., 1998. [7] J. Lafferty, A. McCallum, and F. Pereira. Conditional random fields: Probabilist... |

75 | Multiple comparisons in induction algorithms. Machine Learning 38(3):309–338
- Jensen, Cohen
- 2000
Citation Context ...hip. The key contribution of this paper is the presentation and demonstration of a simple, but useful, method for producing classification models from linked data. In analogy to information retrieval =-=[4]-=-, we represent entities using a vector-space model. The relational vector-space (RVS) model abstracts away much of the graph structure, representing entities by adjacency vectors. Various classificati... |

65 | Feature construction with inductive logic programming: A study of quantitative predictions of biological activity aided by structural attributes
- Srinivasan, King
- 1999
Citation Context ...ed the ILP system FOIL [18] to learn FOL clauses and appended the corresponding binary features to the feature vector in the target table IPO. This methodology has been applied successfully by King =-=[20]-=- and by Popescul et al. [16] to text classification. ILP: We selected four ILP systems based on availability, platform independence and diversity. FOIL [18] uses a top-down, separate-and-conquer stra... |

32 | Data Mining in Social Networks
- Jensen, Neville
Citation Context ... match sets of entity vectors closest to the query considered similar are ranked and returned. For this exposition, we measure similarity by the cosine distance between the corresponding vector pairs =-=[5]-=-. However, any vector-based similarity measure may be considered. The distance measures may be used in standard hierarchical clustering techniques such as dendrograms (Duda, Hart et al. 2001)... |

28 | Propositionalisation and aggregates
- Knobbe, Haas, et al.
Citation Context ...8, pages 580–587, Madison, Wisc., 1998. [7] J. Lafferty, A. McCallum, and F. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proc. ICML01, 2001. =-=[8]-=- Nada Lavrač and Sašo Džeroski. Inductive Logic Programming: Techniques and Applications. Ellis Horwood, New York, New York, 1994. [9] K. P. Murphy, Y. Weiss, and M. I. Jordan. Loopy belief propagat... |

23 | Distance based approaches to relational learning and clustering
- Kirsten, Wrobel, et al.
- 2001
Citation Context ...ustries, with similar results (which we also use for illustration). 4.1 ROC Analysis for Sectors. [Figure 2: ROC curve for weighted, entity-normalized method (averaged over 10 runs).] We use ROC analysis =-=[7, 8]-=- to assess the model’s ability to separate class members from non-members. For a given scoring of companies, ROC curves plot all the possible tradeoffs between correctly classifying the members of the... |

23 | Cprogol4.4: A tutorial introduction
- Muggleton, Firth
- 2002
Citation Context ...cy is achieved. Tilde [1] learns a relational decision tree using FOL clauses in the nodes to split the data. Lime [13] is a top-down ILP system that uses Bayesian criteria to select literals. Progol =-=[14]-=- learns a set of clauses following a bottom-up approach that generalizes the training examples. We did not provide any additional (intensional) background knowledge beyond the facts in the database. We... |

23 | Towards structural logistic regression: Combining relational and statistical learning
- Popescul, Ungar, et al.
- 2002
Citation Context ... learn FOL clauses and appended the corresponding binary features to the feature vector in the target table IPO. This methodology has been applied successfully by King [20] and by Popescul et al. =-=[16]-=- to text classification. ILP: We selected four ILP systems based on availability, platform independence and diversity. FOIL [18] uses a top-down, separate-and-conquer strategy adding literals to the or... |

20 | Stochastic propositionalization of non-determinate background knowledge
- Kramer, Pfahringer, et al.
- 1998
Citation Context ... The prediction of an ILP model is positive if at least one of the clauses is true for the particular case. Binary propositionalization =-=[19]-=-,[10] also learns sets of (first-order) clauses, but rather than using them directly for prediction it constructs binary features that are given as input to a traditional learning method (e.g., decisi... |

17 | Top-Down Induction of First-Order Logical Decision Trees
- Blockeel, De Raedt
- 1998
Citation Context ...ubiquitous, and relational data mining is receiving increasing attention with the explicit linking of web sites, and with the need to analyze social networks for applications such as counterterrorism =-=[1, 2, 3]-=-. We address a particular relational data mining application: identifying the group membership of linked entities. We... |

16 | Three companions for data mining in first order logic
- De Raedt, Blockeel, et al.
- 2001
Citation Context ... vector-space model. In the financial literature and industry, companies are clustered into industry groupings based on correlations in their financial time series (and singular-value decompositions) =-=[11]-=-. Our experiments so far with these methods have not yielded remarkable performance on our classification tasks. Probabilistic and statistically oriented relational learning methods, such as PRMs [12]... |

7 | Induction in first order logic from noisy training samples and fixed sample sizes
- McCreath
- 1999
Citation Context ...ds have not yielded remarkable performance on our classification tasks. Probabilistic and statistically oriented relational learning methods, such as PRMs [12], and relational versions of naïve Bayes =-=[13]-=-, decision trees [14], etc., hold the most promise for competing with the RVS model. These methods do perform aggregations over the values of the attributes at linked nodes. In particular, properly ut... |

6 | Dimensionality reduction in ILP: A call to arms
- Fürnkranz
- 1997
Citation Context ...teresting patterns are possible. Thus, in the relational setting, the issue of feature construction is critical. It is therefore important to explore the problem of automatic feature induction, as in =-=[3]-=-. Finally, we believe that this framework can provide a principled approach for addressing a wide range of applications, including predicting communities of people and hierarchical structure of people... |

1 | On the generalized distance in statistics
- Mahalanobis
- 1936
Citation Context ... Relational Markov Networks. A relational Markov network (RMN) =-=[12]-=- specifies the cliques and potentials between attributes of related entities at a template level, so a single model provides a coherent distribution for any collection of instances from the schema. RM... |