Results 1-10 of 89
Joint inference in information extraction
In Proceedings of the 22nd National Conference on Artificial Intelligence (2007)
Cited by 78 (8 self)
The goal of information extraction is to extract database records from text or semistructured sources. Traditionally, information extraction proceeds by first segmenting each candidate record separately, and then merging records that refer to the same entities. While computationally efficient, this approach is suboptimal, because it ignores the fact that segmenting one candidate record can help to segment similar ones. For example, resolving a well-segmented field with a less-clear one can disambiguate the latter’s boundaries. In this paper we propose a joint approach to information extraction, where segmentation of all records and entity resolution are performed together in a single integrated inference process. While a number of previous authors have taken steps in this direction (e.g., Pasula et al. (2003), Wellner et al. (2004)), to our knowledge this is the first fully joint approach. In experiments on the CiteSeer and Cora citation matching datasets, joint inference improved accuracy, and our approach outperformed previous ones. Further, by using Markov logic and the existing algorithms for it, our solution consisted mainly of writing the appropriate logical formulas, and required much less engineering than previous ones.
Automatically Refining the Wikipedia Infobox Ontology
2008
Cited by 66 (7 self)
The combined efforts of human volunteers have recently extracted numerous facts from Wikipedia, storing them as machine-harvestable object-attribute-value triples in Wikipedia infoboxes. Machine learning systems, such as Kylin, use these infoboxes as training data, accurately extracting even more semantic knowledge from natural language text. But in order to realize the full power of this information, it must be situated in a cleanly-structured ontology. This paper introduces KOG, an autonomous system for refining Wikipedia’s infobox-class ontology towards this end. We cast the problem of ontology refinement as a machine learning problem and solve it using both SVMs and a more powerful joint-inference approach expressed in Markov Logic Networks. We present experiments demonstrating the superiority of the joint-inference approach and evaluating other aspects of our system. Using these techniques, we build a rich ontology, integrating Wikipedia’s infobox-class schemata with WordNet. We demonstrate how the resulting ontology may be used to enhance Wikipedia with improved query processing and other features.
Joint Unsupervised Coreference Resolution with Markov Logic
Cited by 59 (5 self)
Machine learning approaches to coreference resolution are typically supervised, and require expensive labeled data. Some unsupervised approaches have been proposed (e.g., Haghighi and Klein (2007)), but they are less accurate. In this paper, we present the first unsupervised approach that is competitive with supervised ones. This is made possible by performing joint inference across mentions, in contrast to the pairwise classification typically used in supervised methods, and by using Markov logic as a representation language, which enables us to easily express relations like apposition and predicate nominals. On MUC and ACE datasets, our model outperforms Haghighi and Klein’s using only a fraction of the training data, and often matches or exceeds the accuracy of state-of-the-art supervised models.
Efficient weight learning for Markov logic networks
In Proceedings of the Eleventh European Conference on Principles and Practice of Knowledge Discovery in Databases, 2007
Cited by 57 (7 self)
Markov logic networks (MLNs) combine Markov networks and first-order logic, and are a powerful and increasingly popular representation for statistical relational learning. The state-of-the-art method for discriminative learning of MLN weights is the voted perceptron algorithm, which is essentially gradient descent with an MPE approximation to the expected sufficient statistics (true clause counts). Unfortunately, these can vary widely between clauses, causing the learning problem to be highly ill-conditioned, and making gradient descent very slow. In this paper, we explore several alternatives, from per-weight learning rates to second-order methods. In particular, we focus on two approaches that avoid computing the partition function: diagonal Newton and scaled conjugate gradient. In experiments on standard SRL datasets, we obtain order-of-magnitude speedups, or more accurate models given comparable learning times.
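For readers unfamiliar with the quantities this abstract mentions, the following is a rough sketch in standard MLN notation, not taken from the listing itself. The symbols are assumptions for illustration: $w_i$ is the weight of clause $i$, $n_i(x, y)$ its number of true groundings given evidence $x$ and query assignment $y$, $Z_x$ the partition function, and $\eta$ a step size. In practice the expectations and variances would have to be estimated (for example by sampling) rather than computed exactly.

```latex
\begin{align}
P_w(Y = y \mid X = x) &= \frac{1}{Z_x} \exp\Big( \sum_i w_i\, n_i(x, y) \Big), \\
\frac{\partial}{\partial w_i} \log P_w(y \mid x) &= n_i(x, y) - \mathbb{E}_w\big[ n_i(x, Y) \big], \\
\text{voted perceptron:}\quad \mathbb{E}_w\big[ n_i(x, Y) \big] &\approx n_i(x, y^{*}_w), \qquad y^{*}_w = \arg\max_{y'} P_w(y' \mid x), \\
\text{diagonal Newton:}\quad w_i &\leftarrow w_i + \eta\, \frac{n_i(x, y) - \mathbb{E}_w\big[ n_i(x, Y) \big]}{\operatorname{Var}_w\big[ n_i(x, Y) \big]}.
\end{align}
```

Because clause counts $n_i$ can differ by orders of magnitude between clauses, a single global step size fits some weights poorly; dividing each gradient component by the variance of its count is one way to rescale the steps per weight.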
CLP(BN): Constraint logic programming for probabilistic knowledge
In Proceedings of the 19th Conference on Uncertainty in Artificial Intelligence (UAI-03), 2003
Cited by 49 (6 self)
In Datalog, missing values are represented by Skolem constants. More generally, in logic programming missing values, or existentially quantified variables, are represented by terms built from Skolem functors. The CLP(BN) language represents the joint probability distribution over missing values in a database or logic program by using constraints to represent Skolem functions. Algorithms from inductive logic programming (ILP) can be used with only minor modification to learn CLP(BN) programs. An implementation of CLP(BN) is publicly available as part of YAP Prolog.
Bottom-up learning of Markov logic network structure
In Proceedings of the Twenty-Fourth International Conference on Machine Learning, 2007
Cited by 47 (6 self)
Markov logic networks (MLNs) are a statistical relational model that consists of weighted first-order clauses and generalizes first-order logic and Markov networks. The current state-of-the-art algorithm for learning MLN structure follows a top-down paradigm where many potential candidate structures are systematically generated without considering the data and then evaluated using a statistical measure of their fit to the data. Even though this existing algorithm outperforms an impressive array of benchmarks, its greedy search is susceptible to local maxima or plateaus. We present a novel algorithm for learning MLN structure that follows a more bottom-up approach to address this problem. Our algorithm uses a “propositional” Markov network learning method to construct “template” networks that guide the construction of candidate clauses. Our algorithm significantly improves accuracy and learning time over the existing top-down approach in three real-world domains.
Mapping and revising Markov logic networks for transfer learning
In Proceedings of the 22nd National Conference on Artificial Intelligence (AAAI), 2007
Cited by 39 (6 self)
Transfer learning addresses the problem of how to leverage knowledge acquired in a source domain to improve the accuracy and speed of learning in a related target domain. This paper considers transfer learning with Markov logic networks (MLNs), a powerful formalism for learning in relational domains. We present a complete MLN transfer system that first autonomously maps the predicates in the source MLN to the target domain and then revises the mapped structure to further improve its accuracy. Our results in several real-world domains demonstrate that, compared with learning from scratch, our approach reduces the amount of time and training data needed to learn an accurate model of a target domain.
Discriminative Structure and Parameter Learning for Markov Logic Networks
Cited by 36 (5 self)
Markov logic networks (MLNs) are an expressive representation for statistical relational learning that generalizes both first-order logic and graphical models. Existing methods for learning the logical structure of an MLN are not discriminative; however, many relational learning problems involve specific target predicates that must be inferred from given background information. We found that existing MLN methods perform very poorly on several such ILP benchmark problems, and we present improved discriminative methods for learning MLN clauses and weights that outperform existing MLN and traditional ILP methods.
Statistical predicate invention
In Z. Ghahramani (Ed.), Proceedings of the 24th Annual International Conference on Machine Learning (ICML 2007), 2007
Cited by 34 (10 self)
We propose statistical predicate invention as a key problem for statistical relational learning. SPI is the problem of discovering new concepts, properties and relations in structured data, and generalizes hidden variable discovery in statistical models and predicate invention in ILP. We propose an initial model for SPI based on second-order Markov logic, in which predicates as well as arguments can be variables, and the domain of discourse is not fully known in advance. Our approach iteratively refines clusters of symbols based on the clusters of symbols they appear in atoms with (e.g., it clusters relations by the clusters of the objects they relate). Since different clusterings are better for predicting different subsets of the atoms, we allow multiple cross-cutting clusterings. We show that this approach outperforms Markov logic structure learning and the recently introduced infinite relational model on a number of relational datasets.
A General Method for Reducing the Complexity of Relational Inference and Its Application to MCMC
Cited by 30 (4 self)
Many real-world problems are characterized by complex relational structure, which can be succinctly represented in first-order logic. However, many relational inference algorithms proceed by first fully instantiating the first-order theory and then working at the propositional level. The applicability of such approaches is severely limited by the exponential time and memory cost of propositionalization. Singla and Domingos (2006) addressed this by developing a “lazy” version of the WalkSAT algorithm, which grounds atoms and clauses only as needed. In this paper we generalize their ideas to a much broader class of algorithms, including other types of SAT solvers and probabilistic inference methods like MCMC. Lazy inference is potentially applicable whenever variables and functions have default values (i.e., a value that is much more frequent than the others). In relational domains, the default is false for atoms and true for clauses. We illustrate our framework by applying it to MC-SAT, a state-of-the-art MCMC algorithm. Experiments on a number of real-world domains show that lazy inference reduces both space and time by several orders of magnitude, making probabilistic relational inference applicable in previously infeasible domains.
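The default-value idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the class name `LazyAtomStore`, its methods, and the example atoms are hypothetical. It only shows how a truth assignment might keep in memory just the atoms that differ from the default (false), so that the vast majority of ground atoms are never materialized.

```python
class LazyAtomStore:
    """Sparse truth assignment: atoms absent from the store take the default value."""

    def __init__(self, default=False):
        self.default = default
        self._non_default = {}  # only atoms whose value differs from the default

    def get(self, atom):
        # Atoms that were never touched are implicitly at the default value.
        return self._non_default.get(atom, self.default)

    def set(self, atom, value):
        if value == self.default:
            # Returning to the default frees the entry again.
            self._non_default.pop(atom, None)
        else:
            self._non_default[atom] = value

    def stored_atoms(self):
        # Number of atoms actually held in memory.
        return len(self._non_default)


# Hypothetical ground atoms, written as (predicate, arguments...) tuples.
store = LazyAtomStore(default=False)
store.set(("Friends", "Anna", "Bob"), True)
store.set(("Smokes", "Anna"), True)
store.set(("Smokes", "Anna"), False)  # flipped back to the default
```

After these calls only the `Friends` atom is stored; every other atom, including ones never mentioned, reads as false without occupying memory.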