Results 1-10 of 33
StatSnowball: a Statistical Approach to Extracting Entity Relationships
 WWW 2009 MADRID! TRACK: DATA MINING / SESSION: STATISTICAL METHODS
, 2009
Cited by 34 (2 self)
Traditional relation extraction methods require pre-specified relations and relation-specific human-tagged examples. Bootstrapping systems significantly reduce the number of training examples, but they usually apply heuristic-based methods to combine a set of strict hard rules, which limit the ability to generalize and thus generate a low recall. Furthermore, existing bootstrapping methods do not perform open information extraction (Open IE), which can identify various types of relations without requiring pre-specifications. In this paper, we propose a statistical extraction framework called Statistical Snowball (StatSnowball), which is a bootstrapping system and can perform both traditional relation extraction and Open IE. StatSnowball uses the discriminative Markov logic networks
Learning Markov logic network structure via hypergraph lifting
 In Proceedings of the 26th International Conference on Machine Learning (ICML-09)
, 2009
Cited by 31 (3 self)
Markov logic networks (MLNs) combine logic and probability by attaching weights to first-order clauses, and viewing these as templates for features of Markov networks. Learning MLN structure from a relational database involves learning the clauses and weights. The state-of-the-art MLN structure learners all involve some element of greedily generating candidate clauses, and are susceptible to local optima. To address this problem, we present an approach that directly utilizes the data in constructing candidates. A relational database can be viewed as a hypergraph with constants as nodes and relations as hyperedges. We find paths of true ground atoms in the hypergraph that are connected via their arguments. To make this tractable (there are exponentially many paths in the hypergraph), we lift the hypergraph by jointly clustering the constants to form higher-level concepts, and find paths in it. We variabilize the ground atoms in each path, and use them to form clauses, which are evaluated using a pseudo-likelihood measure. In our experiments on three real-world datasets, we find that our algorithm outperforms the state-of-the-art approaches.
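The hypergraph view in this abstract can be illustrated with a small sketch. It is not the authors' algorithm: the predicates and constants are hypothetical, the lifting (clustering) step is omitted, and "connected via their arguments" is assumed here to mean that consecutive atoms share at least one constant.

```python
# Hypothetical toy database: each true ground atom is (predicate, constants).
# Constants act as hypergraph nodes, atoms as hyperedges.
ATOMS = [
    ("Advises", ("alice", "bob")),
    ("Teaches", ("alice", "cs101")),
    ("TAs", ("bob", "cs101")),
]


def connected_paths(atoms, max_len):
    """Enumerate sequences of distinct atoms (up to max_len long) in which
    each atom shares at least one constant with the atom before it."""
    paths = []

    def extend(path):
        paths.append(tuple(path))
        if len(path) == max_len:
            return
        last_constants = set(path[-1][1])
        for atom in atoms:
            if atom not in path and last_constants & set(atom[1]):
                extend(path + [atom])

    for atom in atoms:
        extend([atom])
    return paths


def variabilize(path):
    """Replace each distinct constant in a path with a variable x0, x1, ..."""
    names = {}
    literals = []
    for predicate, constants in path:
        args = []
        for constant in constants:
            if constant not in names:
                names[constant] = f"x{len(names)}"
            args.append(names[constant])
        literals.append(f"{predicate}({', '.join(args)})")
    return literals
```

The exhaustive enumeration grows quickly with path length, which is the tractability problem the abstract says lifting is meant to address.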
Extracting Semantic Networks from Text Via Relational Clustering
Cited by 24 (7 self)
Extracting knowledge from text has long been a goal of AI. Initial approaches were purely logical and brittle. More recently, the availability of large quantities of text on the Web has led to the development of machine learning approaches. However, to date these have mainly extracted ground facts, as opposed to general knowledge. Other learning approaches can extract logical forms, but require supervision and do not scale. In this paper we present an unsupervised approach to extracting semantic networks from large volumes of text. We use the TextRunner system [1] to extract tuples from text, and then induce general concepts and relations from them by jointly clustering the objects and relational strings in the tuples. Our approach is defined in Markov logic using four simple rules. Experiments on a dataset of two million tuples show that it outperforms three other relational clustering approaches, and extracts meaningful semantic networks.
Deep transfer via second-order Markov logic
 In Proceedings of the AAAI Workshop on Transfer Learning For Complex Tasks
, 2008
Cited by 22 (3 self)
Standard inductive learning requires that training and test instances come from the same distribution. Transfer learning seeks to remove this restriction. In shallow transfer, test instances are from the same domain, but have a different distribution. In deep transfer, test instances are from a different domain entirely (i.e., described by different predicates). Humans routinely perform deep transfer, but few learning systems, if any, are capable of it. In this paper we propose an approach based on a form of second-order Markov logic. Our algorithm discovers structural regularities in the source domain in the form of Markov logic formulas with predicate variables, and instantiates these formulas with predicates from the target domain. Using this approach, we have successfully transferred learned knowledge among molecular biology, social network and Web domains. The discovered patterns include broadly useful properties of predicates, like symmetry and transitivity, and relations among predicates, such as various forms of homophily.
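The idea of formulas with predicate variables can be sketched as below. This is only an illustration under stated assumptions: the template encoding, the `instantiate` helper, and the target-domain predicates are hypothetical, and the sketch covers only the instantiation step, not how the regularities are discovered or weighted.

```python
from itertools import product

# Hypothetical second-order templates: R and S are predicate variables,
# x, y, z are ordinary variables. Each template is (body literals, head literal).
TEMPLATES = {
    "symmetry": ([("R", ("x", "y"))], ("R", ("y", "x"))),
    "transitivity": ([("R", ("x", "y")), ("R", ("y", "z"))], ("R", ("x", "z"))),
    "homophily": ([("R", ("x", "y")), ("S", ("x",))], ("S", ("y",))),
}


def render(literal, binding):
    """Render one literal after substituting its predicate variable."""
    pred_var, args = literal
    return f"{binding[pred_var]}({', '.join(args)})"


def instantiate(template, predicates):
    """Bind each predicate variable to every target predicate of matching
    arity. `predicates` maps predicate name -> arity."""
    body, head = template
    arity = {pred_var: len(args) for pred_var, args in body + [head]}
    pred_vars = sorted(arity)
    candidates = [
        [name for name, n in sorted(predicates.items()) if n == arity[v]]
        for v in pred_vars
    ]
    formulas = []
    for choice in product(*candidates):
        binding = dict(zip(pred_vars, choice))
        lhs = " ^ ".join(render(lit, binding) for lit in body)
        formulas.append(f"{lhs} => {render(head, binding)}")
    return formulas
```

For example, with a hypothetical target domain `{"Friends": 2, "Smokes": 1}`, the homophily template yields the single formula `Friends(x, y) ^ Smokes(x) => Smokes(y)`.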
Unsupervised methods for determining object and relation synonyms on the web
 Journal of Artificial Intelligence Research
, 2009
Cited by 21 (2 self)
The task of identifying synonymous relations and objects, or synonym resolution, is critical for high-quality information extraction. This paper investigates synonym resolution in the context of unsupervised information extraction, where neither hand-tagged training examples nor domain knowledge is available. The paper presents a scalable, fully implemented system that runs in O(KN log N) time in the number of extractions, N, and the maximum number of synonyms per word, K. The system, called Resolver, introduces a probabilistic relational model for predicting whether two strings are co-referential based on the similarity of the assertions containing them. On a set of two million assertions extracted from the Web, Resolver resolves objects with 78% precision and 68% recall, and resolves relations with 90% precision and 35% recall. Several variations of Resolver’s probabilistic model are explored, and experiments demonstrate that under appropriate conditions these variations can improve F1 by 5%. An extension to the basic Resolver system allows it to handle polysemous names with 97% precision and 95% recall on a data set from the TREC corpus.
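The abstract reports precision and recall but mentions F1 only as a relative improvement. As a reading aid, the standard F1 definition (the harmonic mean of precision and recall) can be applied to the reported pairs. The values computed this way are derived here and are not figures stated in the abstract.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs as reported in the abstract above.
REPORTED = {
    "objects": (0.78, 0.68),
    "relations": (0.90, 0.35),
    "polysemous names": (0.97, 0.95),
}

for task, (p, r) in REPORTED.items():
    print(f"{task}: F1 = {f1(p, r):.3f}")
```

The relations case shows how a low recall pulls F1 well below the precision figure.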
Modelling Relational Data using Bayesian Clustered Tensor Factorization
Cited by 19 (2 self)
We consider the problem of learning probabilistic models for complex relational structures between various types of objects. A model can help us “understand” a dataset of relational facts in at least two ways, by finding interpretable structure in the data, and by supporting predictions, or inferences about whether particular unobserved relations are likely to be true. Often there is a tradeoff between these two aims: cluster-based models yield more easily interpretable representations, while factorization-based approaches have given better predictive performance on large data sets. We introduce the Bayesian Clustered Tensor Factorization (BCTF) model, which embeds a factorized representation of relations in a nonparametric Bayesian clustering framework. Inference is fully Bayesian but scales well to large data sets. The model simultaneously discovers interpretable clusters and yields predictive performance that matches or beats previous probabilistic models for relational data.
A Three-Way Model for Collective Learning on Multi-Relational Data
Cited by 16 (6 self)
Relational learning is becoming increasingly important in many areas of application. Here, we present a novel approach to relational learning based on the factorization of a three-way tensor. We show that unlike other tensor approaches, our method is able to perform collective learning via the latent components of the model and provide an efficient algorithm to compute the factorization. We substantiate our theoretical considerations regarding the collective learning capabilities of our model by the means of experiments on both a new dataset and a dataset commonly used in entity resolution. Furthermore, we show on common benchmark datasets that our approach achieves better or on-par results, if compared to current state-of-the-art relational learning solutions, while it is significantly faster to compute.
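To make the three-way tensor concrete, the sketch below encodes relational triples as an entity-by-entity slice per relation. The triples and helper names are hypothetical. The bilinear score (one latent vector per entity, one small matrix per relation) is an assumption about the form of the factorization, since the abstract does not spell it out, and no fitting procedure is shown.

```python
# Hypothetical (subject, relation, object) triples.
TRIPLES = [
    ("al", "partyOf", "dem"),
    ("bill", "partyOf", "dem"),
    ("al", "vpOf", "bill"),
]


def build_tensor(triples):
    """Return X with X[k][i][j] = 1 if relation k holds between entity i
    and entity j, plus the sorted entity and relation vocabularies."""
    entities = sorted({e for s, _, o in triples for e in (s, o)})
    relations = sorted({r for _, r, _ in triples})
    e_idx = {e: i for i, e in enumerate(entities)}
    r_idx = {r: k for k, r in enumerate(relations)}
    n = len(entities)
    X = [[[0] * n for _ in range(n)] for _ in relations]
    for s, r, o in triples:
        X[r_idx[r]][e_idx[s]][e_idx[o]] = 1
    return X, entities, relations


def bilinear_score(a_i, R_k, a_j):
    """Assumed scoring form a_i^T R_k a_j for one (i, k, j) tensor entry."""
    return sum(
        a_i[p] * R_k[p][q] * a_j[q]
        for p in range(len(a_i))
        for q in range(len(a_j))
    )
```

Because every relation slice shares the same entity vectors under this assumed form, evidence from one relation can influence predictions for another, which is one way to read the abstract's "collective learning via the latent components".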
Structured machine learning: the next ten years
, 2008
Cited by 10 (2 self)
The field of inductive logic programming (ILP) has made steady progress, since the first ILP workshop in 1991, based on a balance of developments in theory, implementations and applications. More recently there has been an increased emphasis on Probabilistic ILP and the related fields of Statistical Relational Learning (SRL) and Structured Prediction. The goal of the current paper is to consider these emerging trends and chart out the strategic directions and open problems for the broader area of structured machine learning for the next 10 years.
Factorizing YAGO: scalable machine learning for linked data
 In WWW
, 2012
Cited by 10 (4 self)
Vast amounts of structured information have been published in the Semantic Web’s Linked Open Data (LOD) cloud and their size is still growing rapidly. Yet, access to this information via reasoning and querying is sometimes difficult, due to LOD’s size, partial data inconsistencies and inherent noisiness. Machine Learning offers an alternative approach to exploiting LOD’s data with the advantages that Machine Learning algorithms are typically robust to both noise and data inconsistencies and are able to efficiently utilize nondeterministic dependencies in the data. From a Machine Learning point of view, LOD is challenging due to its relational nature and its scale. Here, we present an efficient approach to relational learning on LOD data, based on the factorization of a sparse tensor that scales to data consisting
Abductive Markov Logic for plan recognition
 In Twenty-Fifth National Conference on Artificial Intelligence (AAAI)
, 2011
Cited by 9 (3 self)
Plan recognition is a form of abductive reasoning that involves inferring plans that best explain sets of observed actions. Most existing approaches to plan recognition and other abductive tasks employ either purely logical methods that do not handle uncertainty, or purely probabilistic methods that do not handle structured representations. To overcome these limitations, this paper introduces an approach to abductive reasoning using a first-order probabilistic logic, specifically Markov Logic Networks (MLNs). It introduces several novel techniques for making MLNs efficient and effective for abduction. Experiments on three plan recognition datasets show the benefit of our approach over existing methods.