Results 1–10 of 14
A Tutorial on Dual Decomposition and Lagrangian Relaxation for Inference in Natural Language Processing
Abstract

Cited by 26 (4 self)
Dual decomposition, and more generally Lagrangian relaxation, is a classical method for combinatorial optimization; it has recently been applied to several inference problems in natural language processing (NLP). This tutorial gives an overview of the technique. We describe example algorithms, describe formal guarantees for the method, and describe practical issues in implementing the algorithms. While our examples are predominantly drawn from the NLP literature, the material should be of general relevance to inference problems in machine learning. A central theme of this tutorial is that Lagrangian relaxation is naturally applied in conjunction with a broad class of combinatorial algorithms, allowing inference in models that go significantly beyond previous work on Lagrangian relaxation for inference in graphical models.
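The agreement-based inference the abstract describes can be sketched on a toy problem: two submodels must agree on a binary vector, and Lagrange multipliers are adjusted by subgradient steps until they do. The linear submodels, step-size schedule, and function names below are illustrative assumptions of this sketch, not taken from the tutorial:

```python
import numpy as np

def solve_subproblem(scores):
    # Elementwise argmax for a separable linear objective over {0,1}^n.
    return (scores > 0).astype(float)

def dual_decomposition(f, g, n_iter=100, step=0.5):
    """Maximize f.y + g.z subject to y == z via Lagrangian relaxation.

    f, g: coefficient vectors of two (here, trivially separable) submodels.
    """
    u = np.zeros_like(f)
    for t in range(n_iter):
        y = solve_subproblem(f + u)    # submodel 1 sees f + u
        z = solve_subproblem(g - u)    # submodel 2 sees g - u
        if np.array_equal(y, z):
            return y, True             # agreement gives a certificate of optimality
        u -= step / (t + 1) * (y - z)  # subgradient step on the dual
    return y, False                    # no certificate within the iteration budget

f = np.array([2.0, -1.0, 0.5])
g = np.array([-0.5, 3.0, -2.0])
y, ok = dual_decomposition(f, g)       # agrees on the joint optimum [1, 1, 0]
```

In real applications the two subproblems are solved by combinatorial algorithms (e.g. dynamic programs) rather than an elementwise sign test; the dual update is the same.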
Exact Decoding of Phrase-based Translation Models through Lagrangian Relaxation
To appear in Proc. of EMNLP, 2011
Abstract

Cited by 23 (1 self)
This paper describes an algorithm for exact decoding of phrase-based translation models, based on Lagrangian relaxation. The method recovers exact solutions, with certificates of optimality, on over 99% of test examples. The method is much more efficient than approaches based on linear programming (LP) or integer linear programming (ILP) solvers: these methods are not feasible for anything other than short sentences. We compare our method to MOSES (Koehn et al., 2007), and give precise estimates of the number and magnitude of search errors that MOSES makes.
Unified Expectation Maximization
In NAACL-HLT, 2012
Abstract

Cited by 12 (4 self)
We present a general framework containing a graded spectrum of Expectation Maximization (EM) algorithms called Unified Expectation Maximization (UEM). UEM is parameterized by a single parameter and covers existing algorithms like standard EM and hard EM, constrained versions of EM such as Constraint-Driven Learning (Chang et al., 2007) and Posterior Regularization (Ganchev et al., 2010), along with a range of new EM algorithms. For the constrained inference step in UEM we present an efficient dual projected gradient ascent algorithm which generalizes several dual decomposition and Lagrangian relaxation algorithms popularized recently in the NLP literature (Ganchev et al., 2008; Koo et al., 2010; Rush and Collins, 2011). UEM is as efficient and easy to implement as standard EM. Furthermore, experiments on POS tagging, information extraction, and word alignment show that often the best-performing algorithm in the UEM family is a new algorithm that wasn't available earlier, exhibiting the benefits of the UEM framework.
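A graded spectrum between soft and hard EM can be illustrated by tempering the E-step posterior of a toy Gaussian mixture. This is a simplified stand-in for the idea, not the exact UEM objective; the temperature parameterization and all names below are assumptions of this sketch:

```python
import numpy as np

def e_step(x, mu, gamma):
    """Responsibilities with the posterior sharpened by 1/gamma, renormalized.

    gamma = 1 recovers the standard (soft) E-step; gamma -> 0 approaches
    hard EM's winner-take-all assignment.
    """
    logp = -0.5 * (x[:, None] - mu[None, :]) ** 2   # unit-variance components
    logp = logp / max(gamma, 1e-8)                  # temper the posterior
    logp -= logp.max(axis=1, keepdims=True)         # stabilize the softmax
    r = np.exp(logp)
    return r / r.sum(axis=1, keepdims=True)

def m_step(x, r):
    # Responsibility-weighted means.
    return (r * x[:, None]).sum(axis=0) / r.sum(axis=0)

def graded_em(x, mu0, gamma, iters=50):
    mu = np.array(mu0, dtype=float)
    for _ in range(iters):
        mu = m_step(x, e_step(x, mu, gamma))
    return mu

x = np.array([-2.1, -1.9, -2.0, 1.9, 2.0, 2.1])
mu_soft = graded_em(x, [-1.0, 1.0], gamma=1.0)   # standard EM
mu_hard = graded_em(x, [-1.0, 1.0], gamma=0.01)  # near hard EM
```

On well-separated data both endpoints of the spectrum recover the two cluster means near -2 and 2; the interesting regimes in the paper are the intermediate values.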
Hierarchical Phrase-Based Translation Representations
Abstract

Cited by 12 (4 self)
This paper compares several translation representations for a synchronous context-free grammar parse, including CFGs/hypergraphs, finite-state automata (FSA), and pushdown automata (PDA). The representation choice is shown to determine the form and complexity of the target LM intersection and shortest-path algorithms that follow. Intersection, shortest-path, FSA expansion, and RTN replacement algorithms are presented for PDAs. Chinese-to-English translation experiments using HiFST and HiPDT, FSA- and PDA-based decoders, are presented using admissible (or exact) search, possible for HiFST with compact SCFG rulesets and HiPDT with compact LMs. For large rulesets with large LMs, we introduce a two-pass search strategy, which we then analyze in terms of search errors and translation performance.
Sentence Compression with Joint Structural Inference
Abstract

Cited by 5 (2 self)
Sentence compression techniques often assemble output sentences using fragments of lexical sequences, such as n-grams, or units of syntactic structure, such as edges from a dependency tree representation. We present a novel approach for discriminative sentence compression that unifies these notions and jointly produces sequential and syntactic representations for output text, leveraging a compact integer linear programming formulation to maintain structural integrity. Our supervised models permit rich features over heterogeneous linguistic structures and generalize over previous state-of-the-art approaches. Experiments on corpora featuring human-generated compressions demonstrate a 13–15% relative gain in 4-gram accuracy over a well-studied language-model-based compression system.
Margin-based Decomposed Amortized Inference
Abstract

Cited by 2 (0 self)
Given that structured output prediction is typically performed over entire datasets, one natural question is whether it is possible to reuse computation from earlier inference instances to speed up inference for future instances. Amortized inference has been proposed as a way to accomplish this. In this paper, first, we introduce a new amortized inference algorithm called Margin-based Amortized Inference, which uses the notion of structured margin to identify inference problems for which previous solutions are provably optimal. Second, we introduce decomposed amortized inference, which is designed to address very large inference problems, where earlier amortization methods become less effective. This approach works by decomposing the output structure and applying amortization piecewise, thus increasing the chance that we can reuse previous solutions for parts of the output structure. These parts are then combined into a globally coherent solution using Lagrangian relaxation. In our experiments, using the NLP tasks of semantic role labeling and entity-relation extraction, we demonstrate that with the margin-based algorithm, we need to call the inference engine only for a third of the test examples. Further, we show that the decomposed variant of margin-based amortized inference achieves a greater reduction in the number of inference calls.
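One way to picture margin-based amortization is with an exhaustive toy solver over binary vectors: solving once stores both the solution and its margin, and a later problem whose coefficients differ by less than that margin can provably reuse the cached answer. The class name, exhaustive solver, and L1 reuse test are assumptions of this sketch, not the paper's algorithm:

```python
import itertools
import numpy as np

class AmortizedArgmax:
    """Cache exact solutions to max_y c.y over {0,1}^n and reuse them.

    If the stored margin (gap between the best and second-best scores)
    is at least the L1 change in the coefficients, the cached solution
    is provably still optimal: for binary y, |delta.(y* - y)| <= ||delta||_1,
    so the old winner cannot be overtaken.
    """
    def __init__(self, n):
        self.candidates = [np.array(y, dtype=float)
                           for y in itertools.product([0, 1], repeat=n)]
        self.cache = None  # (coeffs, solution, margin)

    def solve(self, c):
        c = np.asarray(c, dtype=float)
        if self.cache is not None:
            c0, y0, margin = self.cache
            if np.abs(c - c0).sum() <= margin:  # sufficient optimality test
                return y0, True                 # reused; no solver call
        scored = sorted((float(c @ y), tuple(y)) for y in self.candidates)
        best, second = scored[-1], scored[-2]
        y_star = np.array(best[1])
        self.cache = (c, y_star, best[0] - second[0])
        return y_star, False

solver = AmortizedArgmax(3)
y1, reused1 = solver.solve([2.0, -1.0, 0.5])  # exhaustive solve; margin 0.5 cached
y2, reused2 = solver.solve([2.1, -1.1, 0.6])  # L1 change 0.3 < 0.5: cache reused
```

The paper replaces the exhaustive solver with an ILP engine and derives the reuse condition from the structured margin, but the reuse-if-the-margin-covers-the-perturbation logic is the same shape.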
Exact Sampling and Decoding in High-Order Hidden Markov Models
Abstract

Cited by 2 (2 self)
We present a method for exact optimization and sampling from high-order Hidden Markov Models (HMMs), which are generally handled by approximation techniques. Motivated by adaptive rejection sampling and heuristic search, we propose a strategy based on sequentially refining a lower-order language model that is an upper bound on the true model we wish to decode and sample from. This allows us to build tractable variable-order HMMs. The ARPA format for language models is extended to enable efficient use of the max-backoff quantities required to compute the upper bound. We evaluate our approach on two problems: an SMS-retrieval task and a POS tagging experiment using 5-gram models. Results show that the same approach can be used for exact optimization and sampling, while explicitly constructing only a fraction of the total implicit state space.
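The refine-an-upper-bound strategy has a simple discrete analogue: rejection-sample from a proposal that dominates the target everywhere, and tighten the proposal at each point where a rejection occurs, so the target is only evaluated where the sampler actually visits. The discrete setting and names below are illustrative assumptions, far simpler than the variable-order HMM construction in the paper:

```python
import random

def exact_sample(p, q, rng):
    """Draw one exact sample from unnormalized weights p using an
    upper-bound proposal q (q[x] >= p[x] for all x).

    Standard rejection sampling guarantees exactness; tightening q at a
    rejected point only makes future proposals more efficient.
    """
    q = dict(q)                             # local copy; refined on rejection
    while True:
        xs, ws = zip(*q.items())
        x = rng.choices(xs, weights=ws)[0]  # sample from the bound
        if rng.random() <= p[x] / q[x]:     # accept with probability p(x)/q(x)
            return x
        q[x] = p[x]                         # tighten the bound at x and retry

p = {"a": 1.0, "b": 3.0}                    # target weights (b has probability 3/4)
q = {"a": 2.0, "b": 4.0}                    # loose upper bound on p
rng = random.Random(0)
samples = [exact_sample(p, q, rng) for _ in range(2000)]
frac_b = samples.count("b") / len(samples)  # approximately 0.75
```

In the paper the "points" are tag sequences, the bound is a lower-order max-backoff model, and the same refinement loop also drives exact Viterbi decoding.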
Large-scale exact decoding: The IMS-TTT submission to WMT14
In Proceedings of the Ninth Workshop on Statistical Machine Translation, 2014
Abstract

Cited by 1 (0 self)
We present the IMS-TTT submission to WMT14, an experimental statistical tree-to-tree machine translation system based on the multi bottom-up tree transducer, including rule extraction, tuning, and decoding. Thanks to input parse forests and a “no pruning” strategy during decoding, the obtained translations are competitive. The drawbacks are a restricted coverage of 70% on test data, in part due to exact input parse tree matching, and a relatively high runtime. Advantages include easy re-decoding with a different weight vector, since the full translation forests can be stored after the first decoding pass.
TACI: Taxonomy-Aware Catalog Integration
Abstract
Abstract—A fundamental data integration task faced by online commercial portals and commerce search engines is the integration of products coming from multiple providers into their product catalogs. In this scenario, the commercial portal has its own taxonomy (the “master taxonomy”), while each data provider organizes its products into a different taxonomy (the “provider taxonomy”). In this paper, we consider the problem of categorizing products from the data providers into the master taxonomy while making use of the provider taxonomy information. Our approach is based on a taxonomy-aware processing step that adjusts the results of a text-based classifier to ensure that products that are close together in the provider taxonomy remain close in the master taxonomy. We formulate this intuition as a structured prediction optimization problem. To the best of our knowledge, this is the first approach that leverages the structure of taxonomies in order to enhance catalog integration. We propose algorithms that are scalable and thus applicable to the large datasets that are typical on the Web. We evaluate our algorithms on real-world data and show that taxonomy-aware classification provides a significant improvement over existing approaches. Index Terms—catalog integration, classification, data mining, taxonomies.