Results 1-10 of 64
Dual decomposition for parsing with non-projective head automata
In Proc. of EMNLP, 2010
Cited by 51 (6 self)
This paper introduces algorithms for non-projective parsing based on dual decomposition. We focus on parsing algorithms for non-projective head automata, a generalization of head-automata models to non-projective structures. The dual decomposition algorithms are simple and efficient, relying on standard dynamic programming and minimum spanning tree algorithms. They provably solve an LP relaxation of the non-projective parsing problem. Empirically the LP relaxation is very often tight: for many languages, exact solutions are achieved on over 98% of test sentences. The accuracy of our models is higher than previous work on a broad range of datasets.
Minimizing Sparse Higher Order Energy Functions of Discrete Variables
Cited by 49 (9 self)
Higher order energy functions have the ability to encode high level structural dependencies between pixels, which have been shown to be extremely powerful for image labeling problems. Their use, however, is severely hampered in practice by the intractable complexity of representing and minimizing such functions. We observed that higher order functions encountered in computer vision are very often “sparse”, i.e. many labelings of a higher order clique are equally unlikely and hence have the same high cost. In this paper, we address the problem of minimizing such sparse higher order energy functions. Our method works by transforming the problem into an equivalent quadratic function minimization problem. The resulting quadratic function can be minimized using popular message passing or graph cut based algorithms for MAP inference. Although this is primarily a theoretical paper, it also shows how higher order functions can be used to obtain impressive results for the binary texture restoration problem.
On Dual Decomposition and Linear Programming Relaxations for Natural Language Processing
In Proc. EMNLP, 2010
Cited by 48 (2 self)
This paper introduces dual decomposition as a framework for deriving inference algorithms for NLP problems. The approach relies on standard dynamic-programming algorithms as oracle solvers for subproblems, together with a simple method for forcing agreement between the different oracles. The approach provably solves a linear programming (LP) relaxation of the global inference problem. It leads to algorithms that are simple, in that they use existing decoding algorithms; efficient, in that they avoid exact algorithms for the full model; and often exact, in that empirically they often recover the correct solution in spite of using an LP relaxation. We give experimental results on two problems: 1) the combination of two lexicalized parsing models; and 2) the combination of a lexicalized parsing model and a trigram part-of-speech tagger.
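The idea of oracle solvers that are pushed toward agreement can be illustrated with a minimal sketch. It is not the paper's implementation: the two toy oracles, the score vectors `a` and `b`, the fixed step size, and all function names are assumptions chosen for illustration. The sketch maximizes f(y) + g(z) subject to y = z by relaxing the agreement constraint with multipliers `u` and updating them with subgradient steps.

```python
from typing import Callable, List, Tuple

Assignment = List[int]
Oracle = Callable[[List[float]], Assignment]


def make_independent_oracle(a: List[float]) -> Oracle:
    """Hypothetical oracle for f(y) = a.y over unconstrained binary vectors.

    Returns argmax_y f(y) + u.y, which decomposes per coordinate.
    """
    def solve(u: List[float]) -> Assignment:
        return [1 if a_i + u_i > 0 else 0 for a_i, u_i in zip(a, u)]
    return solve


def make_one_hot_oracle(b: List[float]) -> Oracle:
    """Hypothetical oracle for g(z) = b.z where z must select exactly one position.

    Returns argmax_z g(z) - u.z over one-hot vectors.
    """
    def solve(u: List[float]) -> Assignment:
        best = max(range(len(b)), key=lambda i: b[i] - u[i])
        return [1 if i == best else 0 for i in range(len(b))]
    return solve


def dual_decompose(solve_f: Oracle, solve_g: Oracle, n: int,
                   step: float = 0.5, max_iters: int = 100
                   ) -> Tuple[Assignment, bool]:
    """Subgradient method on the dual of: max f(y) + g(z) subject to y = z.

    Returns (assignment, converged). If the oracles agree, the shared
    assignment is optimal for the original problem.
    """
    u = [0.0] * n  # one Lagrange multiplier per agreement constraint
    y = solve_f(u)
    for _ in range(max_iters):
        y = solve_f(u)
        z = solve_g(u)
        if y == z:
            return y, True
        # Subgradient step: penalize positions where the oracles disagree.
        u = [u_i - step * (y_i - z_i) for u_i, y_i, z_i in zip(u, y, z)]
    return y, False  # no agreement: y is only a heuristic answer


if __name__ == "__main__":
    a = [0.5, -1.0, 0.75]   # toy scores, assumed for illustration
    b = [1.0, 0.5, -0.25]
    print(dual_decompose(make_independent_oracle(a), make_one_hot_oracle(b), n=3))
```

Each oracle only ever sees its own scores plus the multipliers, so existing decoders can be reused unchanged; the fixed step size is a simplification, and a decreasing schedule is the more common choice in practice.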
MRF energy minimization and beyond via dual decomposition
In IEEE PAMI, 2011
Cited by 35 (2 self)
This paper introduces a new rigorous theoretical framework to address discrete MRF-based optimization in computer vision. Such a framework exploits the powerful technique of Dual Decomposition. It is based on a projected subgradient scheme that attempts to solve an MRF optimization problem by first decomposing it into a set of appropriately chosen subproblems and then combining their solutions in a principled way. In order to determine the limits of this method, we analyze the conditions that these subproblems have to satisfy and we demonstrate the extreme generality and flexibility of such an approach. We thus show that, by appropriately choosing what subproblems to use, one can design novel and very powerful MRF optimization algorithms. For instance, in this manner we are able to derive algorithms that: 1) generalize and extend state-of-the-art message-passing methods, 2) optimize very tight LP-relaxations to MRF optimization, and 3) take full advantage of the special structure that may exist in particular MRFs, allowing the use of efficient inference techniques such as, e.g., graph-cut based methods. Theoretical analysis of the bounds related to the different algorithms derived from our framework and experimental results/comparisons using synthetic and real data for a variety of tasks in computer vision demonstrate the extreme potential of our approach.
Learning Bayesian Network Structure using LP Relaxations
Cited by 20 (2 self)
We propose to solve the combinatorial problem of finding the highest scoring Bayesian network structure from data. This structure learning problem can be viewed as an inference problem where the variables specify the choice of parents for each node in the graph. The key combinatorial difficulty arises from the global constraint that the graph structure has to be acyclic. We cast the structure learning problem as a linear program over the polytope defined by valid acyclic structures. In relaxing this problem, we maintain an outer bound approximation to the polytope and iteratively tighten it by searching over a new class of valid constraints. If an integral solution is found, it is guaranteed to be the optimal Bayesian network. When the relaxation is not tight, the fast dual algorithms we develop remain useful in combination with a branch and bound method. Empirical results suggest that the method is competitive with or faster than alternative exact methods based on dynamic programming.
Energy Minimization for Linear Envelope MRFs
Cited by 16 (5 self)
Markov random fields with higher order potentials have emerged as a powerful model for several problems in computer vision. In order to facilitate their use, we propose a new representation for higher order potentials as upper and lower envelopes of linear functions. Our representation concisely models several commonly used higher order potentials, thereby providing a unified framework for minimizing the corresponding Gibbs energy functions. We exploit this framework by converting lower envelope potentials to standard pairwise functions with the addition of a small number of auxiliary variables. This allows us to minimize energy functions with lower envelope potentials using conventional algorithms such as BP, TRW and α-expansion. Furthermore, we show how the minimization of energy functions with upper envelope potentials leads to a difficult min-max problem. We address this difficulty by proposing a new message passing algorithm that solves a linear programming relaxation of the problem. Although this is primarily a theoretical paper, we demonstrate the efficacy of our approach on the binary (fg/bg) segmentation problem.
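One way to read the conversion of lower envelope potentials into pairwise functions is sketched below. This is an illustrative reading, not the paper's construction: the data layout, the function names, and the Potts-style example are all assumptions. A clique potential written as a minimum over linear functions is replaced by one auxiliary variable `z` that selects which linear function is active; the energy then contains only a unary term on `z` and pairwise terms between `z` and each clique variable.

```python
from itertools import product
from typing import List, Sequence, Tuple

# Each linear function is (constant, weights) with weights[i][label]
# giving the cost of assigning `label` to clique variable i.
LinearFunction = Tuple[float, List[List[float]]]


def lower_envelope(x: Sequence[int], funcs: List[LinearFunction]) -> float:
    """Higher order potential: the minimum over all linear functions at x."""
    return min(mu + sum(w[i][xi] for i, xi in enumerate(x)) for mu, w in funcs)


def pairwise_energy(x: Sequence[int], z: int, funcs: List[LinearFunction]) -> float:
    """Same potential with auxiliary variable z choosing the active function.

    mu is a unary term on z; each w[i][x_i] is a pairwise term on (z, x_i).
    """
    mu, w = funcs[z]
    return mu + sum(w[i][xi] for i, xi in enumerate(x))


def potts_envelope(k: int, num_labels: int, gamma: float) -> List[LinearFunction]:
    """Assumed example: cost 0 if all k variables agree, gamma otherwise."""
    funcs: List[LinearFunction] = []
    for target in range(num_labels):
        # Linear function that charges gamma per variable not labeled `target`.
        w = [[0.0 if lab == target else gamma for lab in range(num_labels)]
             for _ in range(k)]
        funcs.append((0.0, w))
    # Constant function that caps the cost at gamma.
    funcs.append((gamma, [[0.0] * num_labels for _ in range(k)]))
    return funcs


if __name__ == "__main__":
    funcs = potts_envelope(k=3, num_labels=2, gamma=2.0)
    for x in product(range(2), repeat=3):
        via_aux = min(pairwise_energy(x, z, funcs) for z in range(len(funcs)))
        print(x, lower_envelope(x, funcs), via_aux)
```

Minimizing jointly over the clique labels and `z` gives the same value as minimizing the original higher order potential, which is why pairwise solvers can be applied after the conversion.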
Exact Decoding of Phrase-based Translation Models through Lagrangian Relaxation
To appear in Proc. of EMNLP, 2011
Cited by 16 (1 self)
This paper describes an algorithm for exact decoding of phrase-based translation models, based on Lagrangian relaxation. The method recovers exact solutions, with certificates of optimality, on over 99% of test examples. The method is much more efficient than approaches based on linear programming (LP) or integer linear programming (ILP) solvers: these methods are not feasible for anything other than short sentences. We compare our method to MOSES (Koehn et al., 2007), and give precise estimates of the number and magnitude of search errors that MOSES makes.
An LP View of the M-best MAP problem
Cited by 15 (1 self)
We consider the problem of finding the M assignments with maximum probability in a probabilistic graphical model. We show how this problem can be formulated as a linear program (LP) on a particular polytope. We prove that, for tree graphs (and junction trees in general), this polytope has a particularly simple form and differs from the marginal polytope in a single inequality constraint. We use this characterization to provide an approximation scheme for non-tree graphs, by using the set of spanning trees over such graphs. The method we present puts the M-best inference problem in the context of LP relaxations, which have recently received considerable attention and have proven useful in solving difficult inference problems. We show empirically that our method often finds the provably exact M-best configurations for problems of high treewidth. A common task in probabilistic modeling is finding the assignment with maximum probability given a model. This is often referred to as the MAP (maximum a posteriori) problem.