Results 1 - 4 of 4
Piecewise Training for Structured Prediction
 MACHINE LEARNING
Abstract

Cited by 14 (1 self)
A drawback of structured prediction methods is that parameter estimation requires repeated inference, which is intractable for general structures. In this paper, we present an approximate training algorithm called piecewise training that divides the factors into tractable subgraphs, which we call pieces, that are trained independently. Piecewise training can be interpreted as approximating the exact likelihood using belief propagation, and different ways of making this interpretation yield different insights into the method. We also present an extension to piecewise training, called piecewise pseudolikelihood (PWPL), designed for when variables have large cardinality. On several real-world NLP data sets, piecewise training outperforms Besag's pseudolikelihood and sometimes performs comparably to exact maximum likelihood. In addition, PWPL performs similarly to piecewise training and outperforms standard pseudolikelihood, while being five to ten times more computationally efficient than batch maximum likelihood training.
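The abstract's core idea — optimize each tractable piece's locally normalized likelihood independently, with no global inference — can be sketched as follows. The factor weights and observed assignments here are invented toy values, not taken from the paper:

```python
import math

# Toy "graph": pairwise factors (the pieces), each with a weight per joint
# assignment of its two binary variables. Piecewise training maximizes each
# piece's locally normalized likelihood independently.

def piece_log_likelihood(weights, observed):
    """Log-likelihood of one piece, normalized over its own assignments only."""
    log_z = math.log(sum(math.exp(w) for w in weights.values()))
    return weights[observed] - log_z

# Hypothetical data: per-piece factor weights and the observed assignment.
pieces = [
    ({(0, 0): 1.2, (0, 1): -0.3, (1, 0): 0.1, (1, 1): 0.8}, (1, 1)),
    ({(0, 0): 0.5, (0, 1): 0.9, (1, 0): -0.2, (1, 1): 0.4}, (0, 1)),
]

# The piecewise objective is a sum of independent per-piece terms, so each
# piece can be trained without running inference over the whole graph.
objective = sum(piece_log_likelihood(w, obs) for w, obs in pieces)
print(round(objective, 4))
```

Because the objective decomposes over pieces, each term (and its gradient) can be computed and optimized separately, which is what makes this cheap relative to exact maximum likelihood on an intractable graph.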
Learning Linear Ordering Problems for Better Translation ∗
Abstract

Cited by 12 (0 self)
We apply machine learning to the Linear Ordering Problem in order to learn sentence-specific reordering models for machine translation. We demonstrate that even when these models are used as a mere preprocessing step for German-English translation, they significantly outperform Moses' integrated lexicalized reordering model. Our models are trained on automatically aligned bitext. Their form is simple but novel. They assess, based on features of the input sentence, how strongly each pair of input word tokens w_i, w_j would like to reverse their relative order. Combining all these pairwise preferences to find the best global reordering is NP-hard. However, we present a non-trivial O(n^3) algorithm, based on chart parsing, that at least finds the best reordering within a certain exponentially large neighborhood. We show how to iterate this reordering process within a local search algorithm, which we use in training.
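A minimal sketch of the LOP objective the abstract describes, with an invented pairwise-preference matrix; for tiny inputs the NP-hard problem can simply be brute-forced, whereas the paper's O(n^3) chart-parsing algorithm instead searches an exponentially large neighborhood:

```python
from itertools import permutations

# Hypothetical pairwise preference matrix for a 4-token sentence:
# B[i][j] is the (learned) benefit of placing token i before token j.
B = [
    [ 0, -2,  1,  3],
    [ 2,  0,  4, -1],
    [-1, -4,  0,  2],
    [-3,  1, -2,  0],
]

def lop_score(order):
    """Sum of the pairwise preferences satisfied by this ordering."""
    return sum(B[order[a]][order[b]]
               for a in range(len(order))
               for b in range(a + 1, len(order)))

# Exact LOP is NP-hard in general; for n = 4 we can enumerate all orderings.
best = max(permutations(range(4)), key=lop_score)
print(best, lop_score(best))
```

Note that these toy preferences conflict (no ordering satisfies all of them at once), which is exactly the situation that makes the LOP non-trivial and motivates neighborhood search rather than exhaustive enumeration.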
SEARCH AND LEARNING FOR THE LINEAR ORDERING PROBLEM WITH AN APPLICATION TO MACHINE TRANSLATION
2009
"... This dissertation is about ordering. The problem of arranging a set of n items in a desired order is quite common, as well as fundamental to computer science. Sorting is one instance, as is the Traveling Salesman Problem. Each problem instance can be thought of as optimization of a function that app ..."
Abstract
This dissertation is about ordering. The problem of arranging a set of n items in a desired order is quite common, as well as fundamental to computer science. Sorting is one instance, as is the Traveling Salesman Problem. Each problem instance can be thought of as optimization of a function that applies to the set of permutations. The dissertation treats word reordering for machine translation as another instance of a combinatorial optimization problem. The approach introduced is to combine three different functions of permutations. The first function is based on finite-state automata, the second is an instance of the Linear Ordering Problem, and the third is an entirely new permutation problem related to the LOP. The Linear Ordering Problem has the most attractive computational properties of the three, all of which are NP-hard optimization problems. The dissertation expends significant effort developing neighborhoods for local search on the LOP, and uses grammars and other tools from natural language parsing to introduce several new results, including a state-of-the-art local search procedure. Combinatorial optimization problems such as the TSP or the LOP are usually …
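The "neighborhoods for local search" mentioned above can be illustrated with the simplest neighborhood of all, adjacent transpositions. The scoring function and starting permutation below are toy inventions; the dissertation's grammar-based neighborhoods are exponentially larger:

```python
def adjacent_swap_search(score, start):
    """Hill-climb over the adjacent-transposition neighborhood of a permutation."""
    order = list(start)
    improved = True
    while improved:
        improved = False
        for i in range(len(order) - 1):
            candidate = order[:]
            candidate[i], candidate[i + 1] = candidate[i + 1], candidate[i]
            if score(candidate) > score(order):
                order = candidate
                improved = True
    return order

# Toy objective: penalize each item's distance from its sorted position
# (a stand-in for an LOP-style score); start from the reversed permutation.
score = lambda p: -sum(abs(v - i) for i, v in enumerate(p))
result = adjacent_swap_search(score, [3, 2, 1, 0])
print(result)
```

On this toy objective the hill-climber stalls at [3, 1, 2, 0], a local optimum short of the global optimum [0, 1, 2, 3] — a small illustration of why richer neighborhoods, like the parsing-based ones developed in the dissertation, matter for local search on permutations.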