Results 1–10 of 10
Factorie: Probabilistic programming via imperatively defined factor graphs. In Advances in Neural Information Processing Systems 22, 2009.
Abstract

Cited by 39 (7 self)
Discriminatively trained undirected graphical models have had wide empirical success, and there has been increasing interest in toolkits that ease their application to complex relational data. The power in relational models is in their repeated structure and tied parameters; at issue is how to define these structures in a powerful and flexible way. Rather than using a declarative language, such as SQL or first-order logic, we advocate using an imperative language to express various aspects of model structure, inference, and learning. By combining the traditional, declarative, statistical semantics of factor graphs with imperative definitions of their construction and operation, we allow the user to mix declarative and procedural domain knowledge, and also gain significant efficiencies. We have implemented such imperatively defined factor graphs in a system we call FACTORIE, a software library for an object-oriented, strongly-typed, functional language. In experimental comparisons to Markov Logic Networks on joint segmentation and coreference, we find our approach to be 3–15 times faster while reducing error by 20–25%, achieving a new state of the art.
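The core idea of the abstract, defining repeated factor structure with ordinary imperative code instead of a declarative language, can be sketched in a few lines of plain Python. This is an illustrative toy, not FACTORIE's API (FACTORIE itself is a Scala library); the `Variable` class and `chain_factors` template below are assumptions made for the example.

```python
# Toy sketch of an "imperatively defined" factor graph: the factor
# structure is produced by ordinary code walking the data, and the single
# tied parameter `weight` is shared by every emitted factor.

class Variable:
    def __init__(self, name, value):
        self.name, self.value = name, value

def chain_factors(variables, weight):
    """Imperative template: emit one pairwise log-potential per adjacent
    pair of variables, all sharing the tied parameter `weight`."""
    score = 0.0
    for a, b in zip(variables, variables[1:]):
        # reward agreement between neighbouring variables
        score += weight if a.value == b.value else -weight
    return score

tokens = [Variable(f"t{i}", v) for i, v in enumerate([1, 1, 0])]
total = chain_factors(tokens, 0.5)  # (+0.5 for t0=t1) + (-0.5 for t1!=t2)
```

Because the structure is generated by a function, arbitrary procedural knowledge (skipping pairs, conditioning on data, etc.) can be mixed in where a declarative language would struggle.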
SampleRank: Learning preference from atomic gradients. In NIPS WS on Advances in Ranking, 2009.
Abstract

Cited by 10 (3 self)
Large templated factor graphs with complex structure that changes during inference have been shown to provide state-of-the-art experimental results on tasks such as identity uncertainty and information integration. However, learning parameters in these models is difficult because computing the gradients requires expensive inference routines. In this paper we propose an online algorithm that instead learns preferences over hypotheses from the gradients between the atomic steps of inference. Although there are a combinatorial number of ranking constraints over the entire hypothesis space, a connection to the framework of sampled convex programs reveals a polynomial bound on the number of rankings that need to be satisfied in practice. We further apply ideas of passive-aggressive algorithms to our update rules, enabling us to extend recent work in confidence-weighted classification to structured prediction problems. We compare our algorithm to structured perceptron, contrastive divergence, and persistent contrastive divergence, demonstrating substantial error reductions on two real-world problems (20% over contrastive divergence).
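The "atomic gradient" idea described above can be sketched as a perceptron-style step on the feature difference between two neighbouring MCMC configurations whenever the model's score ranking disagrees with the objective's ranking. This is a simplified sketch of the SampleRank family, not the paper's exact update rule; the feature vectors and learning rate are illustrative assumptions.

```python
import numpy as np

def samplerank_update(w, feats_cur, feats_prop, obj_cur, obj_prop, lr=1.0):
    """One SampleRank-style update between an MCMC state and its proposal:
    if the model misranks the pair relative to the objective function,
    move the weights toward the objectively better configuration."""
    if obj_prop > obj_cur:
        better, worse = feats_prop, feats_cur
    else:
        better, worse = feats_cur, feats_prop
    margin = float(w @ (better - worse))
    if margin <= 0.0:  # model ranks the pair incorrectly (or is tied)
        w = w + lr * (better - worse)
    return w

w = samplerank_update(np.zeros(2),
                      np.array([1.0, 0.0]),   # features of current state
                      np.array([0.0, 1.0]),   # features of proposed state
                      obj_cur=0.2, obj_prop=0.8)
# weights now favour the objectively better configuration's features
```

Only the two configurations touched by the jump are scored, which is what makes the per-step cost so small compared to gradient computations requiring full inference.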
Online Max-Margin Weight Learning for Markov Logic Networks
Abstract

Cited by 7 (1 self)
Most of the existing weight-learning algorithms for Markov Logic Networks (MLNs) use batch training, which becomes computationally expensive and even infeasible for very large datasets since the training examples may not fit in main memory. To overcome this problem, previous work has used online learning algorithms to learn weights for MLNs. However, this prior work has only applied existing online algorithms, and there is no comprehensive study of online weight learning for MLNs. In this paper, we derive a new online algorithm for structured prediction using the primal-dual framework, apply it to learn weights for MLNs, and compare against existing online algorithms on three large, real-world datasets. The experimental results show that our new algorithm generally achieves better accuracy than existing methods, especially on noisy datasets.
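For context, the general shape of an online max-margin update in this family can be sketched as a passive-aggressive step: take the smallest weight change that makes the gold structure beat the predicted one by at least the structured loss. This is a sketch of the generic PA update, not the paper's primal-dual derivation; the feature vectors stand in for MLN clause counts and are assumptions of the example.

```python
import numpy as np

def pa_update(w, feat_gold, feat_pred, loss):
    """Passive-aggressive max-margin step: if the margin between the gold
    and predicted feature vectors is below `loss`, take the minimal-norm
    correction that closes the gap."""
    diff = feat_gold - feat_pred
    hinge = max(0.0, loss - float(w @ diff))  # violated margin, if any
    norm2 = float(diff @ diff)
    if hinge > 0.0 and norm2 > 0.0:
        w = w + (hinge / norm2) * diff        # closed-form step size
    return w

w = pa_update(np.zeros(2), np.array([1.0, 0.0]), np.array([0.0, 1.0]), loss=1.0)
# after the step, w @ (feat_gold - feat_pred) equals the required loss margin
```

Because each example is processed once and discarded, memory stays constant regardless of dataset size, which is the motivation the abstract gives for moving away from batch training.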
Factorie: Efficient probabilistic programming via imperative declarations of structure, inference and learning. In Neural Information Processing Systems (NIPS) Workshop on Probabilistic Programming, 2008.
Abstract

Cited by 6 (3 self)
Discriminatively trained undirected graphical models, or conditional random fields [7], have garnered tremendous interest and empirical success in natural language processing, computer vision, bioinformatics and many other areas [18, 1, 13]. Some of these models use simple structure (e.g. linear chains, grids, fully-connected affinity graphs), but there has been increasing interest in more complex relational structure: capturing more arbitrary dependencies among sets of variables, in repeated patterns. Reimplementing variant structures from scratch is difficult and error-prone, however, and thus there have been several efforts to provide a high-level language in which new undirected model structures can be specified. These include SQL [19], first-order logic [15], and others such as Csoft [20]. Regarding logic, for many years there has been considerable effort in integrating first-order logic and probability [12, 10, 16, 14, 15]. However, we contend that in this combination, the ‘logic’ aspect is mostly a red herring. The power of relational factor graphs is in their repeated relational structure and tied parameters. First-order logic is one way to specify this repeated structure, although it is not necessarily the best; in fact, its focus on boolean variables, difficulty in representing if-then-else, inability to address graph problems such as reachability, and confusability to humans [3]
Inference and learning in large factor graphs with adaptive proposal distributions, 2009.
Abstract

Cited by 4 (3 self)
Large templated factor graphs with complex structure that changes during inference have been shown to provide state-of-the-art experimental results in tasks such as identity uncertainty and information integration. However, inference and learning in these models is notoriously difficult. This paper formalizes, analyzes, and proves convergence for the SampleRank algorithm, which learns extremely efficiently by calculating approximate parameter-estimation gradients from each proposed MCMC jump. Next we present a parameterized, adaptive proposal distribution, which greatly increases the number of accepted jumps. We combine these methods in experiments on a real-world information extraction problem and demonstrate that the adaptive proposal distribution requires 27% fewer jumps than a more traditional proposer.
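The adaptive-proposal idea can be illustrated with a toy hill-climbing sampler whose proposal mixture shifts toward move kinds that get accepted. This is a deliberately simplified sketch (greedy acceptance on a scalar state, not the paper's parameterized proposal or a proper MH acceptance rule); all names, the score function, and the adaptation increment are assumptions of the example.

```python
import random

def adaptive_sampler(score, state, moves, steps, seed=0):
    """Toy sampler with an adaptive proposal mixture: each move kind
    carries a selection weight that grows when its proposals are accepted,
    so productive jump types are proposed more often over time."""
    rng = random.Random(seed)
    names = list(moves)
    weights = {n: 1.0 for n in names}
    for _ in range(steps):
        name = rng.choices(names, weights=[weights[n] for n in names])[0]
        proposal = moves[name](state, rng)
        if score(proposal) > score(state):   # accept only improving jumps
            state = proposal
            weights[name] += 0.1             # adapt toward accepted moves
    return state, weights

# maximize -(x - 3)^2 from x = 0 with unit-step moves
state, weights = adaptive_sampler(
    lambda x: -(x - 3) ** 2, 0,
    {"inc": lambda x, r: x + 1, "dec": lambda x, r: x - 1}, steps=50)
# state climbs to 3; "inc" ends with more weight than the fruitless "dec"
```

The payoff the abstract reports, fewer wasted jumps, comes from exactly this feedback loop between acceptance and proposal probability.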
Distributed MAP inference for undirected graphical models. In Neural Information Processing Systems (NIPS), Workshop on Learning on Cores, Clusters and Clouds, 2010.
Abstract

Cited by 2 (2 self)
Graphical models have widespread uses in information extraction and natural language processing. Recent improvements in approximate inference techniques [1, 2, 3, 4] have allowed exploration of dense models over a large number of variables. These applications include coreference resolution [5, 6], relation extraction [7], and joint inference [8, 9, 10]. But as the graphs grow to web scale,
Selecting Actions for Resource-bounded Information Extraction using Reinforcement Learning
Abstract

Cited by 2 (0 self)
Given a database with missing or uncertain content, our goal is to correct and fill the database by extracting specific information from a large corpus such as the Web, and to do so under resource limitations. We formulate the information gathering task as a series of choices among alternative, resource-consuming actions and use reinforcement learning to select the best action at each time step. We use a temporal-difference Q-learning method to train the function that selects these actions, and compare it to an online, error-driven algorithm called SampleRank. We present a system that finds information such as email, job title, and department affiliation for the faculty at our university, and show that the learning-based approach accomplishes this task efficiently under a limited action budget. Our evaluations show that we can obtain 92.4% of the final F1 by using only 14.3% of all possible actions.
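The temporal-difference Q-learning rule the abstract refers to is the standard tabular update; a minimal sketch follows. The states and actions ("slot_empty", "issue_query", etc.) are hypothetical placeholders for the paper's extraction actions, and the rates are illustrative.

```python
def q_update(Q, s, a, reward, s_next, actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) toward the observed
    reward plus the discounted value of the best next action."""
    best_next = max(Q.get((s_next, b), 0.0) for b in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (reward + gamma * best_next - old)
    return Q

actions = ["issue_query", "stop"]
Q = q_update({}, "slot_empty", "issue_query", 1.0, "slot_filled", actions)
# Q[("slot_empty", "issue_query")] is now 0.5 = 0.5 * (1.0 + 0.9 * 0 - 0)
```

Under a fixed action budget, acting greedily with respect to the learned Q values concentrates the budget on the actions expected to fill the most database slots.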
Training Factor Graphs with Reinforcement Learning for Efficient MAP Inference
Abstract

Cited by 1 (0 self)
Large, relational factor graphs with structure defined by first-order logic or other languages give rise to notoriously difficult inference problems. Because unrolling the structure necessary to represent distributions over all hypotheses has exponential blowup, solutions are often derived from MCMC. However, because of limitations in the design and parameterization of the jump function, these sampling-based methods suffer from local minima: the system must transition through lower-scoring configurations before arriving at a better MAP solution. This paper presents a new method of explicitly selecting fruitful downward jumps by leveraging reinforcement learning (RL). Rather than setting parameters to maximize the likelihood of the training data, parameters of the factor graph are treated as a log-linear function approximator and learned with methods of temporal difference (TD); MAP inference is performed by executing the resulting policy on held-out test data. Our method allows efficient gradient updates since only factors in the neighborhood of variables affected by an action need to be computed; we bypass the need to compute marginals entirely. Our method yields dramatic empirical success, producing new state-of-the-art results on a complex joint model of ontology alignment, with a 48% reduction in error over the previous state of the art in that domain.
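The TD update at the heart of the abstract, with a linear-in-features value function, can be written in a few lines. This is a generic TD(0) sketch rather than the paper's full algorithm; the feature vectors and step sizes are assumptions made for illustration.

```python
import numpy as np

def td0_update(w, feats, feats_next, reward, alpha=0.1, gamma=1.0):
    """One TD(0) step with a linear value function V(s) = w . feats(s):
    nudge the weights along the current state's features by the TD error."""
    delta = reward + gamma * float(w @ feats_next) - float(w @ feats)
    return w + alpha * delta * feats

w = td0_update(np.zeros(2),
               np.array([1.0, 0.0]),   # features of the current configuration
               np.array([0.0, 1.0]),   # features after the chosen jump
               reward=1.0)
# w is now [0.1, 0.0]: the first feature is credited with the reward
```

Only features touched by the jump change between `feats` and `feats_next`, which is why the abstract's claim of cheap, marginal-free gradient updates holds for local moves.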
MAP inference in Large Factor Graphs with Reinforcement Learning
Abstract
Large, relational factor graphs with structure defined by first-order logic or other languages give rise to notoriously difficult inference problems. Because unrolling the structure necessary to represent distributions over all hypotheses has exponential blowup, solutions are often derived from MCMC. However, because of limitations in the design and parameterization of the jump function, these sampling-based methods suffer from local minima: the system must transition through lower-scoring configurations before arriving at a better MAP solution. This paper presents a new method of explicitly selecting fruitful downward jumps by leveraging reinforcement learning (RL) to model delayed reward with a log-linear function approximation of residual future score improvement. Our method provides dramatic empirical success, producing new state-of-the-art results on a complex joint model of ontology alignment, with a 48% reduction in error over the previous state of the art in that domain.