Results 1 – 4 of 4
Piecewise Training for Structured Prediction
 MACHINE LEARNING
Abstract

Cited by 14 (1 self)
A drawback of structured prediction methods is that parameter estimation requires repeated inference, which is intractable for general structures. In this paper, we present an approximate training algorithm called piecewise training that divides the factors into tractable subgraphs, which we call pieces, that are trained independently. Piecewise training can be interpreted as approximating the exact likelihood using belief propagation, and different ways of making this interpretation yield different insights into the method. We also present an extension to piecewise training, called piecewise pseudolikelihood (PWPL), designed for cases in which variables have large cardinality. On several real-world NLP data sets, piecewise training outperforms Besag’s pseudolikelihood and is sometimes comparable to exact maximum likelihood. In addition, PWPL performs similarly to piecewise training and better than standard pseudolikelihood, but is five to ten times more computationally efficient than batch maximum likelihood training.
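The core idea in the abstract above, training each tractable piece under its own local normalization, can be sketched in a few lines. Everything below (the factor encoding, feature vectors, parameter values) is hypothetical toy data for illustration, not the paper's actual setup:

```python
import math
from itertools import product

def piece_log_lik(theta, piece_feats, true_cfg, domains):
    """Local log-likelihood of one piece: its score on the observed
    configuration minus a normalizer computed over this piece's
    variables only (never over the whole graph)."""
    def score(cfg):
        return sum(t * f for t, f in zip(theta, piece_feats[cfg]))
    log_z = math.log(sum(math.exp(score(cfg)) for cfg in product(*domains)))
    return score(true_cfg) - log_z

def piecewise_objective(theta, pieces):
    """Piecewise training objective: the sum of independent local
    log-likelihoods, one per tractable piece."""
    return sum(piece_log_lik(theta, f, y, d) for f, y, d in pieces)

# Toy example: two edge pieces over binary variables sharing parameters.
# Each piece maps a configuration tuple to a feature vector.
edge_feats = {(0, 0): [1.0, 0.0], (0, 1): [0.0, 1.0],
              (1, 0): [0.0, 1.0], (1, 1): [1.0, 0.0]}
pieces = [
    (edge_feats, (1, 1), [(0, 1), (0, 1)]),
    (edge_feats, (0, 0), [(0, 1), (0, 1)]),
]
print(piecewise_objective([0.5, -0.5], pieces))  # ≈ -2.013
```

Because each local normalizer ranges over a single piece, the objective decomposes and no global inference is needed during training.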
Learning Linear Ordering Problems for Better Translation ∗
Abstract

Cited by 12 (0 self)
We apply machine learning to the Linear Ordering Problem in order to learn sentence-specific reordering models for machine translation. We demonstrate that even when these models are used as a mere preprocessing step for German-English translation, they significantly outperform Moses’ integrated lexicalized reordering model. Our models are trained on automatically aligned bitext. Their form is simple but novel. They assess, based on features of the input sentence, how strongly each pair of input word tokens w_i, w_j would like to reverse their relative order. Combining all these pairwise preferences to find the best global reordering is NP-hard. However, we present a non-trivial O(n^3) algorithm, based on chart parsing, that at least finds the best reordering within a certain exponentially large neighborhood. We show how to iterate this reordering process within a local search algorithm, which we use in training.
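The Linear Ordering Problem objective the abstract describes (summing a pairwise preference for every pair placed in a given relative order) is easy to state in code. The preference matrix `B` below is a made-up toy, and exhaustive search stands in for the paper's chart-parsing neighborhood search, which this sketch does not implement:

```python
from itertools import permutations

def lop_score(B, perm):
    """Linear Ordering Problem objective: sum the preference B[i][j]
    for every pair of items that perm places with i before j."""
    pos = {item: k for k, item in enumerate(perm)}
    n = len(B)
    return sum(B[i][j] for i in range(n) for j in range(n)
               if i != j and pos[i] < pos[j])

def best_ordering(B):
    """Exhaustive search over all n! orderings: fine for tiny n, and a
    reminder of why the general problem is NP-hard."""
    return max(permutations(range(len(B))), key=lambda p: lop_score(B, p))

# Hypothetical preferences: B[i][j] = benefit of putting token i before j.
B = [[0, 2, 1],
     [0, 0, 3],
     [4, 0, 0]]
print(best_ordering(B), lop_score(B, best_ordering(B)))  # (1, 2, 0) 7
```

Note that the identity order scores only 6 here; the pairwise preferences interact, which is exactly why finding the global optimum is hard.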
Approved as to style and content by:
2008
Abstract
It is customary for doctoral theses to begin with embarrassingly fulsome thanks to one’s advisor, professors, family, and friends. This thesis will be no exception. I am therefore fortunate that I have had a wonderful network of mentors, family, and friends, so that I can express my thanks without risk of overstatement. I could not have completed this thesis without the help of my advisor Andrew McCallum. Andrew is a dynamo of enthusiasm, encouragement, and interesting suggestions, as anyone who has had him drop by their cubicle can attest. I have been especially influenced by his keen sense of how to find research areas that are theoretically interesting, practically important, and ripe for the attack. I am also grateful to have benefited from the feedback of the other members of my dissertation committee: Sridhar Mahadevan, Erik Learned-Miller, Jonathan Machta, and Tommi Jaakkola. I spent a delightful summer doing an internship at Microsoft Research Cambridge. I thank Tom Minka and Martin Szummer for hosting me, and for our many discussions that helped me to better understand approximate inference and training, and that helped many of the explanations in this thesis to become crisper and more insightful. At UMass, I have also learned much from discussions with Wei Li, Aron Culotta, Greg Druck, Gideon Mann, and Jerod Weinman. I have also had the opportunity to collaborate on several projects outside of my thesis, and for this I thank my
Search and Learning for the Linear Ordering Problem with an Application to Machine Translation
, 2009
Abstract
This dissertation is about ordering. The problem of arranging a set of n items in a desired order is quite common, as well as fundamental to computer science. Sorting is one instance, as is the Traveling Salesman Problem. Each problem instance can be thought of as optimization of a function that applies to the set of permutations. The dissertation treats word reordering for machine translation as another instance of a combinatorial optimization problem. The approach introduced is to combine three different functions of permutations. The first function is based on finite-state automata, the second is an instance of the Linear Ordering Problem (LOP), and the third is an entirely new permutation problem related to the LOP. The Linear Ordering Problem has the most attractive computational properties of the three, all of which are NP-hard optimization problems. The dissertation expends significant effort developing neighborhoods for local search on the LOP, and uses grammars and other tools from natural language parsing to introduce several new results, including a state-of-the-art local search procedure. Combinatorial optimization problems such as the TSP or the LOP are usually
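As a rough illustration of local search on the LOP (not the grammar-based neighborhoods the dissertation actually develops), here is a greedy search over the simplest possible neighborhood, adjacent transpositions, on a hypothetical preference matrix:

```python
def local_search(B, perm):
    """Greedy LOP local search over the adjacent-transposition
    neighborhood: keep swapping neighboring items while a swap
    improves the objective. Terminates because each accepted swap
    strictly increases the (bounded) total score."""
    perm = list(perm)
    improved = True
    while improved:
        improved = False
        for k in range(len(perm) - 1):
            i, j = perm[k], perm[k + 1]
            # Swapping adjacent i, j changes the score by B[j][i] - B[i][j];
            # all other pairwise contributions are unaffected.
            if B[j][i] > B[i][j]:
                perm[k], perm[k + 1] = j, i
                improved = True
    return perm

# Toy preferences that favor the fully reversed order.
B = [[0, 0, 0],
     [1, 0, 0],
     [1, 1, 0]]
print(local_search(B, [0, 1, 2]))  # [2, 1, 0]
```

This tiny neighborhood can get stuck in local optima on harder instances, which is precisely why the dissertation invests in larger, parsing-based neighborhoods that can be searched efficiently.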