Results 1 - 10
of
163
A hierarchical phrase-based model for statistical machine translation
- In ACL
, 2005
"... We present a statistical phrase-based translation model that uses hierarchical phrases— phrases that contain subphrases. The model is formally a synchronous context-free grammar but is learned from a bitext without any syntactic information. Thus it can be seen as a shift to the formal machinery of ..."
Abstract
-
Cited by 257 (7 self)
- Add to MetaCart
We present a statistical phrase-based translation model that uses hierarchical phrases— phrases that contain subphrases. The model is formally a synchronous context-free grammar but is learned from a bitext without any syntactic information. Thus it can be seen as a shift to the formal machinery of syntaxbased translation systems without any linguistic commitment. In our experiments using BLEU as a metric, the hierarchical phrasebased model achieves a relative improvement of 7.5 % over Pharaoh, a state-of-the-art phrase-based system. 1
Hierarchical phrase-based translation
- Computational Linguistics
, 2007
"... We present a statistical machine translation model that uses hierarchical phrases—phrases that contain subphrases. The model is formally a synchronous context-free grammar but is learned from a parallel text without any syntactic annotations. Thus it can be seen as combining fundamental ideas from b ..."
Abstract
-
Cited by 209 (4 self)
- Add to MetaCart
We present a statistical machine translation model that uses hierarchical phrases—phrases that contain subphrases. The model is formally a synchronous context-free grammar but is learned from a parallel text without any syntactic annotations. Thus it can be seen as combining fundamental ideas from both syntax-based translation and phrase-based translation. We describe our system’s training and decoding methods in detail, and evaluate it for translation speed and translation accuracy. Using BLEU as a metric of translation accuracy, we find that our system performs significantly better than the Alignment Template System, a state-of-the-art phrasebased system. 1.
Scalable inference and training of context-rich syntactic translation models
- In ACL
, 2006
"... Statistical MT has made great progress in the last few years, but current translation models are weak on re-ordering and target language fluency. Syntactic approaches seek to remedy these problems. In this paper, we take the framework for acquiring multi-level syntactic translation rules of (Galley ..."
Abstract
-
Cited by 136 (15 self)
- Add to MetaCart
Statistical MT has made great progress in the last few years, but current translation models are weak on re-ordering and target language fluency. Syntactic approaches seek to remedy these problems. In this paper, we take the framework for acquiring multi-level syntactic translation rules of (Galley et al., 2004) from aligned tree-string pairs, and present two main extensions of their approach: first, instead of merely computing a single derivation that minimally explains a sentence pair, we construct a large number of derivations that include contextually richer rules, and account for multiple interpretations of unaligned words. Second, we propose probability estimates and a training procedure for weighting these rules. We contrast different approaches on real examples, show that our estimates based on multiple derivations favor phrasal re-orderings that are linguistically better motivated, and establish that our larger rules provide a 3.63 BLEU point increase over minimal rules. 1
A Smorgasbord of Features for Statistical Machine Translation
, 2004
"... We describe a methodology for rapid experimentation in statistical machine translation which we use to add a large number of features to a baseline system exploiting features from a wide range of levels of syntactic representation. Feature ..."
Abstract
-
Cited by 93 (3 self)
- Add to MetaCart
We describe a methodology for rapid experimentation in statistical machine translation which we use to add a large number of features to a baseline system exploiting features from a wide range of levels of syntactic representation. Feature
Discriminative reranking for machine translation
- In HLTNAACL 2004
, 2004
"... This paper describes the application of discriminative reranking techniques to the problem of machine translation. For each sentence in the source language, we obtain from a baseline statistical machine translation system, a ranked nbest list of candidate translations in the target language. We intr ..."
Abstract
-
Cited by 43 (1 self)
- Add to MetaCart
This paper describes the application of discriminative reranking techniques to the problem of machine translation. For each sentence in the source language, we obtain from a baseline statistical machine translation system, a ranked nbest list of candidate translations in the target language. We introduce two novel perceptroninspired reranking algorithms that improve on the quality of machine translation over the baseline system based on evaluation using the BLEU metric. We provide experimental results on the NIST 2003 Chinese-English large data track evaluation. We also provide theoretical analysis of our algorithms and experiments that verify that our algorithms provide state-of-theart performance in machine translation. 1
Bilingual parsing with factored estimation: Using English to parse Korean
- In Proc. of EMNLP
, 2004
"... We describe how simple, commonly understood statistical models, such as statistical dependency parsers, probabilistic context-free grammars, and word-to-word translation models, can be effectively combined into a unified bilingual parser that jointly searches for the best English parse, Korean parse ..."
Abstract
-
Cited by 42 (11 self)
- Add to MetaCart
We describe how simple, commonly understood statistical models, such as statistical dependency parsers, probabilistic context-free grammars, and word-to-word translation models, can be effectively combined into a unified bilingual parser that jointly searches for the best English parse, Korean parse, and word alignment, where these hidden structures all constrain each other. The model used for parsing is completely factored into the two parsers and the TM, allowing separate parameter estimation. We evaluate our bilingual parser on the Penn Korean Treebank and against several baseline systems and show improvements parsing Korean with very limited labeled data. 1
Learning Synchronous Grammars for Semantic Parsing with Lambda Calculus
, 2007
"... This paper presents the first empirical results to our knowledge on learning synchronous grammars that generate logical forms. Using statistical machine translation techniques, a semantic parser based on a synchronous context-free grammar augmented with λ-operators is learned given a set of training ..."
Abstract
-
Cited by 39 (6 self)
- Add to MetaCart
This paper presents the first empirical results to our knowledge on learning synchronous grammars that generate logical forms. Using statistical machine translation techniques, a semantic parser based on a synchronous context-free grammar augmented with λ-operators is learned given a set of training sentences and their correct logical forms. The resulting parser is shown to be the best-performing system so far in a database query domain.
Chinese Syntactic Reordering for Statistical Machine Translation
- In Proceedings of EMNLP
, 2007
"... Syntactic reordering approaches are an effective method for handling word-order differences between source and target languages in statistical machine translation (SMT) systems. This paper introduces a reordering approach for translation from Chinese to English. We describe a set of syntactic reorde ..."
Abstract
-
Cited by 38 (0 self)
- Add to MetaCart
Syntactic reordering approaches are an effective method for handling word-order differences between source and target languages in statistical machine translation (SMT) systems. This paper introduces a reordering approach for translation from Chinese to English. We describe a set of syntactic reordering rules that exploit systematic differences between Chinese and English word order. The resulting system is used as a preprocessor for both training and test sentences, transforming Chinese sentences to be much closer to English in terms of their word order. We evaluated the reordering approach within the MOSES phrase-based SMT system (Koehn et al., 2007). The reordering approach improved the BLEU score for the MOSES system from 28.52 to 30.86 on the NIST 2006 evaluation data. We also conducted a series of experiments to analyze the accuracy and impact of different types of reordering rules. 1
Learning for Semantic Parsing with Statistical Machine Translation
, 2006
"... We present a novel statistical approach to semantic parsing, WASP, for constructing a complete, formal meaning representation of a sentence. A semantic parser is learned given a set of sentences annotated with their correct meaning representations. The main innovation of WASP is its use of state-of- ..."
Abstract
-
Cited by 34 (8 self)
- Add to MetaCart
We present a novel statistical approach to semantic parsing, WASP, for constructing a complete, formal meaning representation of a sentence. A semantic parser is learned given a set of sentences annotated with their correct meaning representations. The main innovation of WASP is its use of state-of-the-art statistical machine translation techniques. A word alignment model is used for lexical acquisition, and the parsing model itself can be seen as a syntax-based translation model. We show that WASP performs favorably in terms of both accuracy and coverage compared to existing learning methods requiring similar amount of supervision, and shows better robustness to variations in task complexity and word order.
A Decoder for Syntax-based Statistical MT
- PROCEEDINGS OF THE 40TH ANNIVERSARY MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL-02
, 2002
"... This paper describes a decoding algorithm for a syntax-based translation model (Yamada and Knight, 2001). The model has been extended to incorporate phrasal translations as presented here. In contrast to a conventional word-to-word statistical model, a decoder for the syntaxbased model builds ..."
Abstract
-
Cited by 32 (2 self)
- Add to MetaCart
This paper describes a decoding algorithm for a syntax-based translation model (Yamada and Knight, 2001). The model has been extended to incorporate phrasal translations as presented here. In contrast to a conventional word-to-word statistical model, a decoder for the syntaxbased model builds up an English parse tree given a sentence in a foreign language.

