## Transliteration as constrained optimization (2008)

Venue: | In Proc. EMNLP |

Citations: | 18 - 4 self |

### BibTeX

@INPROCEEDINGS{Goldwasser08transliterationas,
    author = {Dan Goldwasser and Dan Roth},
    title = {Transliteration as constrained optimization},
    booktitle = {In Proc. EMNLP},
    year = {2008}
}

### Abstract

This paper introduces a new method for identifying named-entity (NE) transliterations in bilingual corpora. Recent work has shown the advantage of discriminative approaches to transliteration: given two strings (w_s, w_t) in the source and target language, a classifier is trained to determine whether w_t is the transliteration of w_s. This paper shows that the transliteration problem can be formulated as a constrained optimization problem and can thus take into account contextual dependencies and constraints among character bi-grams in the two strings. We further explore several methods for learning the objective function of the optimization problem and show the advantage of learning it discriminatively. Our experiments show that the new framework results in over 50% improvement in translating English NEs to Hebrew.
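The idea of scoring a string pair by solving a constrained optimization problem can be illustrated with a small sketch. This is not the paper's formulation: the weights, the function name `best_alignment_score`, and the specific constraints (each character used at most once, no crossing alignments) are assumptions chosen for illustration, and the problem is solved here by dynamic programming rather than by an ILP solver.

```python
from functools import lru_cache

def best_alignment_score(ws, wt, weights):
    """Maximize the total weight of aligned character pairs between ws and wt.

    Constraints (illustrative assumptions, not taken from the paper):
      * each character takes part in at most one aligned pair;
      * aligned pairs may not cross (the alignment is monotonic).
    `weights` maps (source_char, target_char) to a real-valued score;
    missing pairs score 0.0.
    """
    @lru_cache(maxsize=None)
    def solve(i, j):
        # No more pairs can be formed once either string is exhausted.
        if i == len(ws) or j == len(wt):
            return 0.0
        skip_source = solve(i + 1, j)   # leave ws[i] unaligned
        skip_target = solve(i, j + 1)   # leave wt[j] unaligned
        align = weights.get((ws[i], wt[j]), 0.0) + solve(i + 1, j + 1)
        return max(skip_source, skip_target, align)

    return solve(0, 0)

# Hypothetical weights: the two crossing pairs score highest individually,
# but the non-crossing constraint forbids selecting both of them.
example_weights = {
    ("a", "x"): 2.0,
    ("b", "y"): 1.5,
    ("a", "y"): 3.0,
    ("b", "x"): 3.0,
}
score = best_alignment_score("ab", "xy", example_weights)
```

In this toy instance an unconstrained scorer would pick both crossing pairs, while the constrained optimum selects the two non-crossing pairs. A decision rule could then threshold the resulting score, with the weights themselves learned from data.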

### Citations

397 | On discriminative vs. generative classifiers: A comparison of logistic regression and naive Bayes - Ng, Jordan - 2002
Citation Context ...sented with a real-valued feature vector instead of a binary vector. This can be viewed as providing a better starting point for the learner, which improves the learning rate (Golding and Roth, 1999; Ng and Jordan, 2001). The weight vector learned by the discriminative training is denoted WD. Given the new weight vector, we can define a new feature extraction operator, that we get by applying the objective function ...

170 | Learning to Resolve Natural Language Ambiguities: A Unified Approach - Roth - 1998
Citation Context ... training data, which is now used to train a discriminative model. We use a linear classifier trained with a regularized average perceptron update rule (Grove and Roth, 2001) as implemented in SNoW (Roth, 1998). This learning algorithm provides a simple and general linear classifier that has been demonstrated to work well in other NLP classification tasks, e.g. (Punyakanok et al., 2005), and allows us to i...

119 | A linear programming formulation for global inference in natural language tasks - Roth, Yih - 2004
Citation Context ... classifier as a way to learn the objective function for the global constrained optimization problem. Our technical approach follows a large body of work developed over the last few years, following (Roth and Yih, 2004) that has formalized global decisions problems in NLP as constrained optimization problems and solved these optimization problems using Integer Linear Programming (ILP) or other methods (Punyakanok e...

65 | A Winnow based approach to context-sensitive spelling correction - Golding, Roth - 1999
Citation Context ...mple, the learner is presented with a real-valued feature vector instead of a binary vector. This can be viewed as providing a better starting point for the learner, which improves the learning rate (Golding and Roth, 1999; Ng and Jordan, 2001). The weight vector learned by the discriminative training is denoted WD. Given the new weight vector, we can define a new feature extraction operator, that we get by applying th...

61 | The necessity of syntactic parsing for semantic role labeling - Punyakanok, Roth, et al. - 2005
Citation Context ...nd Yih, 2004) that has formalized global decisions problems in NLP as constrained optimization problems and solved these optimization problems using Integer Linear Programming (ILP) or other methods (Punyakanok et al., 2005; Barzilay and Lapata, 2006; Clarke and Lapata, ; Marciniak and Strube, 2005). We investigate several ways to train our objective function, which is represented as a dot product between a set of featu...

30 | Weakly supervised named entity transliteration and discovery from multilingual comparable corpora - Klementiev, Roth - 2006 |

29 | Aggregation via set partitioning for natural language generation - Barzilay, Lapata - 2006
Citation Context ...rmalized global decisions problems in NLP as constrained optimization problems and solved these optimization problems using Integer Linear Programming (ILP) or other methods (Punyakanok et al., 2005; Barzilay and Lapata, 2006; Clarke and Lapata, ; Marciniak and Strube, 2005). We investigate several ways to train our objective function, which is represented as a dot product between a set of features chosen to represent a p...

28 | Alignment-Based Discriminative String Similarity - Bergsma, Kondrak - 2007
Citation Context ...apers have followed up on this basic approach and focused on semi-supervised approaches to this problem or on extracting better features for the discriminative classifier (Klementiev and Roth, 2006b; Bergsma and Kondrak, 2007; Goldwasser and Roth, 2008). While it has been clear that the relevancy of pairwise features is context sensitive and that there are contextual constraints among them, the hope was that a discriminat...

26 | Name translation in statistical machine translation - learning when to transliterate - Hermjakob, Knight, et al. - 2008 |

21 | Linear concepts and hidden variables - Grove, Roth - 2001 |

20 | Modelling compression with discourse constraints - Clarke, Lapata - 2007 |

15 | Unsupervised named entity transliteration using temporal and phonetic correlation - Tao, Yoon, et al. - 2006
Citation Context ...actory, machine learning approaches have been developed to address this problem. The common approach adopted is therefore to view this problem as a classification problem (Klementiev and Roth, 2006a; Tao et al., 2006) and train a discriminative classifier. That is, given two strings, one in the source and the other in the target language, extract pairwise features, and train a classifier that determines if one is...

11 | Name Translation in Statistical Machine Translation: Learning When to Transliterate - Hermjakob, Knight, Daumé III - 2008 |

6 | Active sample selection for named entity transliteration - Goldwasser, Roth - 2008
Citation Context ...his basic approach and focused on semi-supervised approaches to this problem or on extracting better features for the discriminative classifier (Klementiev and Roth, 2006b; Bergsma and Kondrak, 2007; Goldwasser and Roth, 2008). While it has been clear that the relevancy of pairwise features is context sensitive and that there are contextual constraints among them, the hope was that a discriminative approach will be suffic...

2 | Named entity transliteration and discovery from multilingual comparable corpora - 2006a |