## Statistical syntax-directed translation with extended domain of locality (2006)

Venue: | In Proc. AMTA 2006 |

Citations: | 92 - 14 self |

### BibTeX

@INPROCEEDINGS{Huang06statisticalsyntax-directed,
  author = {Liang Huang},
  title = {Statistical syntax-directed translation with extended domain of locality},
  booktitle = {In Proc. AMTA 2006},
  year = {2006},
  pages = {66--73}
}

### Abstract

In syntax-directed translation, the source-language input is first parsed into a parse-tree, which is then recursively converted into a string in the target-language. We model this conversion by an extended tree-to-string transducer that has multi-level trees on the source-side, which gives our system more expressive power and flexibility. We also define a direct probability model and use a linear-time dynamic programming algorithm to search for the best derivation. The model is then extended to the general log-linear framework in order to incorporate other features like n-gram language models. We devise a simple-yet-effective algorithm to generate non-duplicate k-best translations for n-gram rescoring. Preliminary experiments on English-to-Chinese translation show a significant improvement in terms of translation quality compared to a state-of-the-art phrase-based system.
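The recursive conversion described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm or code: the tuple encoding of trees, the `"x0:NP-C"` variable syntax, the function names `match` and `translate`, and all rules and probabilities below are assumptions made up for illustration. Each rule pairs a multi-level source-side tree pattern with a target-side string, and a memoized bottom-up search picks the highest-probability derivation at each node.

```python
import math
from functools import lru_cache

# Assumed encoding: a tree is (label, child, ...); a leaf word is a plain string.
# In a pattern, a child written "x0:NP-C" is a variable matching any subtree
# rooted in NP-C; every other string must equal the word in the tree.

def match(pattern, tree, bindings):
    """Match a (possibly multi-level) pattern at tree, recording variable bindings."""
    if isinstance(pattern, str):
        if ":" in pattern:  # variable such as "x0:NP-C"
            var, label = pattern.split(":")
            if isinstance(tree, tuple) and tree[0] == label:
                bindings[var] = tree
                return True
            return False
        return pattern == tree  # terminal word
    if not isinstance(tree, tuple) or len(pattern) != len(tree) or pattern[0] != tree[0]:
        return False
    return all(match(p, t, bindings) for p, t in zip(pattern[1:], tree[1:]))

def translate(tree, rules):
    """Return (log-probability, target words) of the best derivation of tree."""
    @lru_cache(maxsize=None)  # each distinct subtree is solved once
    def best(node):
        best_score, best_words = -math.inf, None
        for pattern, target, prob in rules:
            bindings = {}
            if not match(pattern, node, bindings):
                continue
            score, words = math.log(prob), []
            for tok in target:
                if tok in bindings:  # variable: splice in the sub-translation
                    sub_score, sub_words = best(bindings[tok])
                    if sub_words is None:  # subtree has no derivation
                        score = -math.inf
                        break
                    score += sub_score
                    words.extend(sub_words)
                else:  # target-language word
                    words.append(tok)
            if score > best_score:
                best_score, best_words = score, tuple(words)
        return best_score, best_words
    return best(tree)

# Hypothetical toy rules (source pattern, target string, probability).
RULES = [
    (("S", "x0:NP-C", "x1:VP"), ("x0", "x1"), 1.0),
    (("NP-C", ("DT", "the"), ("NN", "gunman")), ("qiangshou",), 1.0),
    (("VP", ("VBD", "was"),
      ("VP-C", "x0:VBN", ("PP", ("IN", "by"), "x1:NP-C"))), ("bei", "x1", "x0"), 0.6),
    (("VBN", "killed"), ("jibi",), 0.8),
    (("VBN", "killed"), ("shasi",), 0.2),
    (("NP-C", ("NN", "police")), ("jingfang",), 0.9),
]

TREE = ("S",
        ("NP-C", ("DT", "the"), ("NN", "gunman")),
        ("VP", ("VBD", "was"),
         ("VP-C", ("VBN", "killed"),
          ("PP", ("IN", "by"), ("NP-C", ("NN", "police"))))))

score, words = translate(TREE, RULES)
print(" ".join(words), math.exp(score))
```

In this toy setup the passive-construction rule spans several tree levels in one step and reorders its two variables, which is the kind of pattern a one-level rule cannot express. Because each subtree is solved once and the rule set is fixed, the work grows roughly in proportion to the number of tree nodes.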

### Citations

1274 | The mathematics of statistical machine translation: Parameter estimation - Brown, Pietra, et al. - 1993 |

1011 | Head-driven statistical models for natural language parsing - Collins - 2003 |

Citation Context: ... a smaller search space. In fact, the recursive transfer step can be done by a linear-time algorithm (see Section 5), and the parsing step is also fast with the modern Treebank parsers, for instance (Collins, 1999; Charniak, 2000). In contrast, their decodings are reported to be computationally expensive and Chiang (2005) uses aggressive pruning to make it tractable. There also exists a compromise between thes...

868 | A maximum-entropy-inspired parser - Charniak - 2000 |

Citation Context: ...ch space. In fact, the recursive transfer step can be done by a linear-time algorithm (see Section 5), and the parsing step is also fast with the modern Treebank parsers, for instance (Collins, 1999; Charniak, 2000). In contrast, their decodings are reported to be computationally expensive and Chiang (2005) uses aggressive pruning to make it tractable. There also exists a compromise between these two approaches...

671 | An introduction to Tree Adjoining Grammars - Joshi - 1987 |

Citation Context: ...AG with no adjunctions. STSGs and STAGs generate more tree relations than SCFGs, e.g. the non-isomorphic tree pair in Fig. 1. This extra expressive power lies in the extended domain of locality (EDL) (Joshi and Schabes, 1997), i.e., elementary structures beyond the scope of one-level context-free productions. Besides being linguistically motivated, the need for EDL is also supported by empirical findings in MT that one-le...

477 | Stochastic inversion transduction grammars and bilingual parsing of parallel corpora - Wu - 1997 |

418 | Discriminative training and maximum entropy models for statistical machine translation - Och, Ney - 2002 |

400 | A hierarchical phrase-based model for statistical machine translation - Chiang - 2005 |

387 | The alignment template approach to statistical machine translation - Och, Ney - 2004 |

Citation Context: ..., respectively, and get the completed Chinese string in (e). 2 Previous Work It is helpful to compare this approach with recent efforts in statistical MT. Phrase-based models (Koehn et al., 2003; Och and Ney, 2004) are good at learning local translations that are pairs of (consecutive) sub-strings, but often insufficient in modeling the reorderings of phrases themselves, especially between language pairs with ...

243 | What’s in a translation rule? - Galley, Hopkins, et al. - 2004 |

Citation Context: ...cope of one-level context-free productions. Besides being linguistically motivated, the need for EDL is also supported by empirical findings in MT that one-level rules are often inadequate (Fox, 2002; Galley et al., 2004). Similarly, in the tree-transducer terminology, Graehl and Knight (2004) define extended tree transducers that have multi-level trees on the source-side. Since syntax-directed translation models sep...

240 | Tree Automata. Akadémiai Kiadó - Gécseg, Steinby - 1984 |

Citation Context: ...tic tree. In this context, a syntax-directed translator consists of two components, a source-language parser and a recursive converter which is usually modeled as a top-down tree-to-string transducer (Gécseg and Steinby, 1984). This paper adapts the idea of syntax-directed translation to statistical machine translation (MT). We apply stochastic operations at each node of the source-language parse-tree and search for the b...

155 | Better k-best parsing - Huang, Chiang - 2005 |

148 | Dependency treelet translation: Syntactically informed phrasal SMT - Quirk, Menezes, et al. - 2005 |

Citation Context: ...(Yamada and Knight, 2001), is not enough to capture a common construction like this which is five levels deep (from VP to “by”). Recent works on dependency-based MT (Lin, 2004; Ding and Palmer, 2005; Quirk et al., 2005) are closest to this work in the sense that their translations are also based on source-language parse trees. The difference is that they use dependency trees instead of constituent trees. Although t...

111 | Training tree transducers - Graehl, Knight - 2004 |

Citation Context: ...closer to the semantic representation. 3 Extended Tree-to-String Transducers In this section, we define the formal machinery of our recursive transformation model as a special case of xRs transducers (Graehl and Knight, 2004) that has only one state, and each rule is linear (L) and non-deleting (N) with regards to variables in the source and target sides (hence the name 1-xRLNs). Definition 1. A 1-xRLNs transducer is a t...

96 | Clause restructuring for statistical machine translation |

94 | Optimal code generation for expression trees - Aho, Johnson - 1976 |

Citation Context: ...oportional to the input length. The full pseudo-code is worked out in Algorithm 1. A restricted version of this algorithm first appears in compiling for optimal code generation from expression-trees (Aho and Johnson, 1976). In computational linguistics, the bottom-up version of this algorithm resembles the tree parsing algorithm for TSG by Eisner (2003). Similar algorithms have also been proposed for dependency-based ...

81 | Introduction to Algorithms - Rivest - 1992 |

76 | Syntax-directed transduction - Lewis, Stearns - 1968 |

Citation Context: ...ovement in terms of translation quality compared to a state-of-the-art phrase-based system. 1 Introduction The concept of syntax-directed translation was originally proposed in compiling (Irons, 1961; Lewis and Stearns, 1968; Aho and Ullman, 1972), where the source program is parsed into a tree representation that guides the generation of the object code. In other words, the translation is directed by a syntactic tree. I...

40 | A syntax directed compiler for ALGOL 60 - Irons - 1961 |

Citation Context: ...nificant improvement in terms of translation quality compared to a state-of-the-art phrase-based system. 1 Introduction The concept of syntax-directed translation was originally proposed in compiling (Irons, 1961; Lewis and Stearns, 1968; Aho and Ullman, 1972), where the source program is parsed into a tree representation that guides the generation of the object code. In other words, the translation is direct...

31 | A Weighted Finite State Transducer Implementation of the Alignment Template Model for Statistical Machine Translation - Kumar, Byrne - 2003 |

Citation Context: ...rings of phrases themselves, especially between language pairs with very different word-order. This is because the generative capacity of these models lies within the realm of finite-state machinery (Kumar and Byrne, 2003), which is unable to process nested structures and long-distance dependencies in natural languages. Syntax-based models aim to alleviate this problem by exploiting the power of synchronous rewriting ...

26 | The Theory of Parsing, Translation and Compiling, volume I: Parsing - Aho, Ullman - 1972 |

Citation Context: ...lation quality compared to a state-of-the-art phrase-based system. 1 Introduction The concept of syntax-directed translation was originally proposed in compiling (Irons, 1961; Lewis and Stearns, 1968; Aho and Ullman, 1972), where the source program is parsed into a tree representation that guides the generation of the object code. In other words, the translation is directed by a syntactic tree. In this context, a synt...

26 | The impact of parse quality on syntactically-informed statistical machine translation - Quirk, Corston-Oliver - 2006 |

Citation Context: ...e search, rather than k-best rescoring. Besides, we will extend this work to translating the top k parse trees, instead of committing to the 1-best tree, as parsing errors affect translation quality (Quirk and Corston-Oliver, 2006). Acknowledgements The authors wish to thank Wei Wang, Radu Soricut and Steve Deneefe for help with data and tools. We are also grateful to David Chiang, Yuan Ding, Jonathan Graehl, Daniel Marcu, Mit...

21 | A better n-best list: Practical determinization of weighted finite tree automata - May, Knight - 2006 |

20 | A path-based transfer model for machine translation - Lin - 2004 |

Citation Context: ...if informed by the Treebank as in (Yamada and Knight, 2001), is not enough to capture a common construction like this which is five levels deep (from VP to “by”). Recent works on dependency-based MT (Lin, 2004; Ding and Palmer, 2005; Quirk et al., 2005) are closest to this work in the sense that their translations are also based on source-language parse trees. The difference is that they use dependency tre...

19 | An Efficient Algorithm for the N-Best-Strings Problem - Mohri, Riley - 2002 |

7 | Machine translation using probabilistic synchronous dependency insertion grammars - Ding, Palmer - 2005 |

Citation Context: ... by the Treebank as in (Yamada and Knight, 2001), is not enough to capture a common construction like this which is five levels deep (from VP to “by”). Recent works on dependency-based MT (Lin, 2004; Ding and Palmer, 2005; Quirk et al., 2005) are closest to this work in the sense that their translations are also based on source-language parse trees. The difference is that they use dependency trees instead of constitue...