Results 1 -
7 of
7
Dual decomposition for parsing with nonprojective head automata
- In Proc. of EMNLP
, 2010
"... This paper introduces algorithms for nonprojective parsing based on dual decomposition. We focus on parsing algorithms for nonprojective head automata, a generalization of head-automata models to non-projective structures. The dual decomposition algorithms are simple and efficient, relying on standa ..."
Abstract
-
Cited by 31 (6 self)
- Add to MetaCart
This paper introduces algorithms for nonprojective parsing based on dual decomposition. We focus on parsing algorithms for nonprojective head automata, a generalization of head-automata models to non-projective structures. The dual decomposition algorithms are simple and efficient, relying on standard dynamic programming and minimum spanning tree algorithms. They provably solve an LP relaxation of the non-projective parsing problem. Empirically the LP relaxation is very often tight: for many languages, exact solutions are achieved on over 98 % of test sentences. The accuracy of our models is higher than previous work on a broad range of datasets. 1
Approximate Inference in Graphical Models using LP Relaxations
, 2010
"... Graphical models such as Markov random fields have been successfully applied to a wide variety of fields, from computer vision and natural language processing, to computational biology. Exact probabilistic inference is generally intractable in complex models having many dependencies between the vari ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
Graphical models such as Markov random fields have been successfully applied to a wide variety of fields, from computer vision and natural language processing, to computational biology. Exact probabilistic inference is generally intractable in complex models having many dependencies between the variables. We present new approaches to approximate inference based on linear programming (LP) relaxations. Our algorithms optimize over the cycle relaxation of the marginal polytope, which we show to be closely related to the first lifting of the Sherali-Adams hierarchy, and is significantly tighter than the pairwise LP relaxation. We show how to efficiently optimize over the cycle relaxation using a cutting-plane algorithm that iteratively introduces constraints into the relaxation. We provide a criterion to determine which constraints would be most helpful in tightening the relaxation, and give efficient algorithms for solving the search problem of finding the best cycle constraint to add according to this criterion.
An Alternating Direction Method for Dual MAP LP Relaxation
"... Abstract. Maximum a-posteriori (MAP) estimation is an important task in many applications of probabilistic graphical models. Although finding an exact solution is generally intractable, approximations based on linear programming (LP) relaxation often provide good approximate solutions. In this paper ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
Abstract. Maximum a-posteriori (MAP) estimation is an important task in many applications of probabilistic graphical models. Although finding an exact solution is generally intractable, approximations based on linear programming (LP) relaxation often provide good approximate solutions. In this paper we present an algorithm for solving the LP relaxation optimization problem. In order to overcome the lack of strict convexity, we apply an augmented Lagrangian method to the dual LP. The algorithm, based on the alternating direction method of multipliers (ADMM), is guaranteed to converge to the global optimum of the LP relaxation objective. Our experimental results show that this algorithm is competitive with other state-of-the-art algorithms for approximate MAP estimation.
Augmenting Dual Decomposition for MAP Inference
"... In this paper, we propose combining augmented Lagrangian optimization with the dual decomposition method to obtain a fast algorithm for approximate MAP (maximum a posteriori) inference on factor graphs. We also show how the proposed algorithm can efficiently handle problems with (possibly global) st ..."
Abstract
- Add to MetaCart
In this paper, we propose combining augmented Lagrangian optimization with the dual decomposition method to obtain a fast algorithm for approximate MAP (maximum a posteriori) inference on factor graphs. We also show how the proposed algorithm can efficiently handle problems with (possibly global) structural constraints. The experimental results reported testify for the state-of-the-art performance of the proposed approach. 1
Complex Loss Optimization via Dual Decomposition
"... We describe a novel max-margin parameter learning approach for structured prediction problems under certain non-decomposable performance measures. Structured prediction is a common approach in many vision problems. Non-decomposable performance measures are also commonplace. However, efficient genera ..."
Abstract
- Add to MetaCart
We describe a novel max-margin parameter learning approach for structured prediction problems under certain non-decomposable performance measures. Structured prediction is a common approach in many vision problems. Non-decomposable performance measures are also commonplace. However, efficient general methods for learning parameters against non-decomposable performance measures do not exist. In this paper we develop such a method, based on dual decomposition, that is applicable to a large class of non-decomposable performance measures. We exploit dual decomposition to factorize the original hard problem into two smaller problems and show how to optimize each factor efficiently. We show experimentally that the proposed approach significantly outperforms alternatives, which either sacrifice the model structure or approximate the performance measure, and is an order of magnitude faster than a previous approach with comparable results. 1.
eBay Research Labs
"... We present SampleRank, an alternative to contrastive divergence (CD) for estimating parameters in complex graphical models. SampleRank harnesses a user-provided loss function to distribute stochastic gradients across an MCMC chain. As a result, parameter updates can be computed between arbitrary MCM ..."
Abstract
- Add to MetaCart
We present SampleRank, an alternative to contrastive divergence (CD) for estimating parameters in complex graphical models. SampleRank harnesses a user-provided loss function to distribute stochastic gradients across an MCMC chain. As a result, parameter updates can be computed between arbitrary MCMC states. SampleRank is not only faster than CD, but also achieves better accuracy in practice (up to 23% error reduction on noun-phrase coreference). 1.

