## Data-Oriented Models of Parsing and Translation (2005)

Citations: 12 (2 self)

### BibTeX

@TECHREPORT{Hearne05data-orientedmodels,
    author = {Mary Hearne},
    title = {Data-Oriented Models of Parsing and Translation},
    institution = {},
    year = {2005}
}

### Abstract

A dissertation submitted in fulfilment of the requirements for the award of

### Citations

1554 | BLEU: A method for Automatic Evaluation of Machine Translation
- Papineni, Roukos, et al.
- 2002
Citation Context ...performed automatic evaluation against reference translations using a metric he defined himself (Poutsma, 2000):58, Poutsma’s work predates the development of the automatic evaluation metrics – Bleu (Papineni et al., 2001, 2002), NIST (NIST, 2002; Doddington, 2002) and F-score (Melamed et al., 2003; Turian et al., 2003) – currently in use. Poutsma’s metric, termed ‘Largest Translation Part’, is far less sophisticated... |

1316 | A Systematic Comparison of Various Statistical Alignment Models
- Och, Ney
- 2003
Citation Context ...put our results into context, we ran SMT experiments over the same training and test data.[8] For each split x we trained on the same training data trainx (but [Footnote 8:] Training was carried out using Giza++ (Och and Ney, 2003) downloaded from http://www.fjoch.com/GIZA++.html. Translations were generated using the ISI ReWrite Decoder (Germann et al., 2001; Germann, 2003) downloaded from http://www.isi.edu/licensed-sw/rewri... |

674 | An efficient context-free parsing algorithm
- Earley
- 1970
Citation Context ...osition [0][1]. Standard chart-parsing algorithms that compute the PCFG parse space for a given input string include the CKY algorithm (Younger, 1967; Aho and Ullman, 1972) and Earley’s algorithm (Earley, 1970; Stolcke, 1995). The CKY algorithm enters rules onto the parse chart in a left-to-right bottom-up manner. This algorithm requires the grammar with which it parses to be in Chomsky-Normal Form (CNF), ... |

460 | Stochastic inversion transduction grammars and bilingual parsing of parallel corpora - Wu - 1997 |

331 | Review Article: Example-based Machine Translation
- Somers
- 1999
Citation Context ...stic differences are a different matter. It is generally the case that such systems adhere quite closely to the basic structures of the source language when formulating a target language translation (Hutchins and Somers, 1992). DOT, on the other hand, formulates translations on the basis of the evidence in the fragment base, meaning that it will model stylistic as well as syntactic differences according to the evidence pr... |

330 | Automatic Evaluation of Machine Translation Quality Using N-Gram Co-Occurrence Statistics
- Doddington
- 2002
Citation Context ... translations using a metric he defined himself (Poutsma, 2000):58, Poutsma’s work predates the development of the automatic evaluation metrics – Bleu (Papineni et al., 2001, 2002), NIST (NIST, 2002; Doddington, 2002) and F-score (Melamed et al., 2003; Turian et al., 2003) – currently in use. Poutsma’s metric, termed ‘Largest Translation Part’, is far less sophisticated than these newer metrics, and so it is dif... |

324 | Statistical language modeling using the CMU-Cambridge toolkit
- Clarkson, Rosenfeld
- 1997
Citation Context ...nerated using the ISI ReWrite Decoder (Germann et al., 2001; Germann, 2003) downloaded from http://www.isi.edu/licensed-sw/rewritedecoder/ and the CMU-Cambridge Statistical Language Modeling toolkit (Clarkson and Rosenfeld, 1997) downloaded from http://mi.eng.cam.ac.uk/ prc14/toolkit.html. [Table, English to French translation. Columns: Bleu score, NIST score, F-score, Exact (%). SMT: 0.2686, 4.984, 0.6203, 22.29. DOT (worst): 0.4168, 6.105, 0.6513, 25...] |

300 | Lexical-Functional Syntax
- Bresnan
- 2001
Citation Context ...ssumes. It is known that these representations, which reflect surface syntactic phenomena only, do not adequately describe many aspects of human language. The LFG formalism (Kaplan and Bresnan, 1982; Bresnan, 2001; Dalrymple, 2001), on the other hand, is known to be beyond context-free. As its representations encode grammatical features (such as number, case and tense) and identify the grammatical functions of... |

296 | The Penn Treebank: Annotating Predicate Argument Structure
- Marcus, Kim, et al.
- 1994
Citation Context ...rofound effect on the probability models which result. This issue is discussed in greater detail in section 2.6. [...] were performed using sections 2–21 of the WSJ Penn-II treebank (Marcus et al., 1994) for training, section 22 for development and section 23 for testing (as standard). 7 As it was not practical (due to memory limitations) to use all possible fragments for training, all fragments of ... |

271 | Convolution kernels for natural language - Collins, Duffy - 2002 |

268 | A syntax-based statistical translation model
- Yamada, Knight
- 2001
Citation Context ...rces nor generated further linguistic resources from the data. This situation has, however, changed slightly in order to incorporate information about the structure of language into the models (e.g. (Yamada and Knight, 2001; Charniak et al., 2003; Melamed, 2004)). The translation model of Yamada and Knight (2001) assumes bilingual aligned sentence pairs where each source sentence has been syntactically parsed. The model... |

217 | PCFG models of linguistic tree representations
- Johnson
- 1998
Citation Context ...alised PCFG in rule (ii) as it states that likes is followed by an object NP. Both the Data-Oriented Parsing (DOP) (e.g. (Bod, 1998; Bod et al., 2003)) and PCFG approaches to syntactic parsing (e.g. (Johnson, 1999)) are experience-based in that they learn by extrapolating syntactic generalisations, along with their probabilities, from a set of example parses. The DOP methodology is applied in precisely the sam... |

194 | An efficient probabilistic context-free parsing algorithm that computes prefix probabilities
- Stolcke
- 1995
Citation Context .... Standard chart-parsing algorithms that compute the PCFG parse space for a given input string include the CKY algorithm (Younger, 1967; Aho and Ullman, 1972) and Earley’s algorithm (Earley, 1970; Stolcke, 1995). The CKY algorithm enters rules onto the parse chart in a left-to-right bottom-up manner. This algorithm requires the grammar with which it parses to be in Chomsky-Normal Form (CNF), i.e. the right-... |

151 | Beyond Grammar: An Experience-Based Theory of Language
- Bod
- 1998
Citation Context ...egorisation information for the verb likes is made explicit by this head-lexicalised PCFG in rule (ii) as it states that likes is followed by an object NP. Both the Data-Oriented Parsing (DOP) (e.g. (Bod, 1998; Bod et al., 2003)) and PCFG approaches to syntactic parsing (e.g. (Johnson, 1999)) are experience-based in that they learn by extrapolating syntactic generalisations, along with their probabilities,... |

145 | Stochastic Attribute-Value Grammars - Abney - 1997 |

133 | Lexical Functional Grammar: a formal system for grammatical representation" in Bresnan (ed
- Kaplan, Bresnan
- 1982
Citation Context ...orpus representations it assumes. It is known that these representations, which reflect surface syntactic phenomena only, do not adequately describe many aspects of human language. The LFG formalism (Kaplan and Bresnan, 1982; Bresnan, 2001; Dalrymple, 2001), on the other hand, is known to be beyond context-free. As its representations encode grammatical features (such as number, case and tense) and identify the grammatic... |

120 | Fast decoding and optimal decoding for machine translation, ACL
- Germann
- 2001
Citation Context ...same training data trainx (but [Footnote 8:] Training was carried out using Giza++ (Och and Ney, 2003) downloaded from http://www.fjoch.com/GIZA++.html. Translations were generated using the ISI ReWrite Decoder (Germann et al., 2001; Germann, 2003) downloaded from http://www.isi.edu/licensed-sw/rewritedecoder/ and the CMU-Cambridge Statistical Language Modeling toolkit (Clarkson and Rosenfeld, 1997) downloaded from http://mi.eng... |

112 | Enriching Linguistics with Statistics: Performance Models of Natural Language
- Bod
- 1995
Citation Context ...o use Viterbi optimisation when computing the MPP as sub-derivations which are less likely are pruned from the chart regardless of the fact that their probabilities may contribute to that of the MPP (Bod, 1995b). As it is not feasible to compute exactly the MPP for DOP and it may be less computationally expensive to compute exactly the MPD than to approximate the MPP using random sampling, it is important ... |

107 | Valence induction with a head-lexicalized PCFG - Carroll, Rooth - 1998 |

103 | Computational Complexity of Probabilistic Disambiguation by means of Tree Grammars - Sima’an - 1996 |

97 | Parsing algorithms and metrics
- Goodman
- 1996
Citation Context ...problematic, as is the computation of exact sampling probabilities during disambiguation. [Section 7.5.1, Computing the LFG-DOP parse space: PCFG-reduction for LFG-DOP] Applying the PCFG-reduction method of (Goodman, 1996a, 1998, 2003) to the implementation of LFG-DOP is problematic, not just because the PCFG-reduction must characterise fragments which comprise f-structures as well as phrase-structure trees, but becau... |

86 | Parsing Inside-Out
- Goodman
- 1998
Citation Context ...1995b; Sima’an, 1995a; Bod, 2001, 2003b)). We also describe the solutions which have been developed to date to address the tasks of building the DOP parse space for an input string (e.g. (Bod, 1995a; Goodman, 1998; Sima’an, 1999)) and selecting the best parse from that space according to the model (e.g. (Bod, 1995a, 2000e; Chappelier and Rajman, 2003)). Finally, we present alternatives to the DOP fragment prob... |

81 | Memory-based language processing
- Daelemans, Bosch
- 2005
Citation Context ...an only be used to estimate PDOP of parse tree P if the following condition is fulfilled: [Footnote 13:] de Pauw (2003) presents an approximation of the DOP model through Memory-Based Language Processing (MBLP) (Daelemans, 1999), in which the memory-based aspect of the model is exploited. Under this model, a parse forest for each input string is generated using the grammar underlying the training treebank. The best parse is... |

76 | A computational model of language performance: Data oriented parsing
- Bod
- 1992
Citation Context ...d problem. One possible solution to this challenge is the Data-Oriented Translation (DOT) model originally proposed by Poutsma (1998, 2000, 2003), which is based on Data-Oriented Parsing (DOP) (e.g. (Bod, 1992; Bod et al., 2003)) and combines examples, linguistic information and a statistical translation model. In this thesis, we seek to establish how the DOT model of translation relates to the other main ... |

68 | Statistical machine translation by parsing
- Melamed
- 2004
Citation Context ...from the data. This situation has, however, changed slightly in order to incorporate information about the structure of language into the models (e.g. (Yamada and Knight, 2001; Charniak et al., 2003; Melamed, 2004)). The translation model of Yamada and Knight (2001) assumes bilingual aligned sentence pairs where each source sentence has been syntactically parsed. The model transforms a source-language parse tr... |

60 | Efficient algorithms for parsing the DOP model
- Goodman
- 1996
Citation Context ...problematic, as is the computation of exact sampling probabilities during disambiguation. [Section 7.5.1, Computing the LFG-DOP parse space: PCFG-reduction for LFG-DOP] Applying the PCFG-reduction method of (Goodman, 1996a, 1998, 2003) to the implementation of LFG-DOP is problematic, not just because the PCFG-reduction must characterise fragments which comprise f-structures as well as phrase-structure trees, but becau... |

59 | A probabilistic corpus-driven model for lexical functional analysis
- Bod, Kaplan
- 1998
Citation Context ...xical and functional information – with the DOP and DOT models of parsing and translation. We outline the theoretical and empirical work which has been carried out to date on the LFG-DOP model (e.g. (Bod and Kaplan, 1998, 2003)). We show how parameter reestimation techniques developed for the DOP model which assumes phrase-structure trees (Sima’an and Buratto, 2003) can be applied to the LFG-DOP model (Hearne and Sim... |

54 | Evaluation of Machine Translation and its Evaluation
- Turian, Shen, et al.
- 2003
Citation Context ...sma, 2000):58, Poutsma’s work predates the development of the automatic evaluation metrics – Bleu (Papineni et al., 2001, 2002), NIST (NIST, 2002; Doddington, 2002) and F-score (Melamed et al., 2003; Turian et al., 2003) – currently in use. Poutsma’s metric, termed ‘Largest Translation Part’, is far less sophisticated than these newer metrics, and so it is difficult to draw meaningful conclusions from the automatic... |

48 | Precision and recall of machine translation
- Melamed, Green, et al.
- 2003
Citation Context ... defined himself (Poutsma, 2000):58, Poutsma’s work predates the development of the automatic evaluation metrics – Bleu (Papineni et al., 2001, 2002), NIST (NIST, 2002; Doddington, 2002) and F-score (Melamed et al., 2003; Turian et al., 2003) – currently in use. Poutsma’s metric, termed ‘Largest Translation Part’, is far less sophisticated than these newer metrics, and so it is difficult to draw meaningful conclusio... |

43 | The Theory of Parsing, Translation and Compiling, volume 1: Parsing
- Aho, Ullman
- 1972
Citation Context ... left-hand side A and be selected from chart position [0][1]. Standard chart-parsing algorithms that compute the PCFG parse space for a given input string include the CKY algorithm (Younger, 1967; Aho and Ullman, 1972) and Earley’s algorithm (Earley, 1970; Stolcke, 1995). The CKY algorithm enters rules onto the parse chart in a left-to-right bottom-up manner. This algorithm requires the grammar with which it parse... |

39 | Language Theory and Language Technology; Competence and Performance (originally
- Scha
- 1990
Citation Context ...ional Grammar (LFG) in both theoretical and practical terms. The following gives a more detailed description of the material we present. Chapter 2 Data-Oriented Parsing (DOP) was first introduced in (Scha, 1990; Bod, 1992). In this chapter, we give an overview of the state of the art for this model. Firstly, we focus on the general characteristics of DOP by looking at the types of dependencies captured, and... |

36 | Parsing with the shortest derivation - Bod - 2000 |

33 | The DOP estimation method is biased and inconsistent
- Johnson
Citation Context ...ly, we present alternatives to the DOP fragment probability estimation method (e.g. (Bonnema et al., 2000; Sima’an and Buratto, 2003)) which has been shown to be unsatisfactory (Bonnema et al., 2000; Johnson, 2002). Chapter 3 In this chapter, we present the DOP system we have developed in terms of implementation and performance. Firstly, we describe the algorithms used to implement each component of our parser... |

32 | An optimized algorithm for Data Oriented Parsing - Sima'an - 1996 |

30 | An efficient implementation of a new DOP model - Bod - 2003 |

29 | Syntax-based language models for machine translation
- Charniak, Knight, et al.
- 2003
Citation Context ...r linguistic resources from the data. This situation has, however, changed slightly in order to incorporate information about the structure of language into the models (e.g. (Yamada and Knight, 2001; Charniak et al., 2003; Melamed, 2004)). The translation model of Yamada and Knight (2001) assumes bilingual aligned sentence pairs where each source sentence has been syntactically parsed. The model transforms a source-la... |

27 | What is the minimal set of fragments that achieves maximal parse accuracy
- Bod
- 2001
Citation Context ... and the required probability model. We discuss pruning techniques which have been proposed to reduce grammar size and how these reductions impact on parse accuracy (e.g. (Bod, 1995b; Sima’an, 1995a; Bod, 2001, 2003b)). We also describe the solutions which have been developed to date to address the tasks of building the DOP parse space for an input string (e.g. (Bod, 1995a; Goodman, 1998; Sima’an, 1999)) a... |

27 | Seeing the Wood for the Trees: Data-Oriented Translation
- Hearne, Way
- 2003
Citation Context ...replace the notion of fragment depth – the greatest number of steps taken to get from the root node to any frontier node – with the notion of link depth for fragments comprising linked subtree pairs (Hearne and Way, 2003). The link depth of a fragment is the greatest number of steps taken which depart from a linked node to get from the root node to any frontier node. This yields the same result whether calculated ove... |

27 | Learning Efficient Disambiguation - Sima’an - 1999 |

26 | Robust sub-sentential alignment of phrase-structure trees
- Groves, Hearne, et al.
- 2004
Citation Context ...wledge of both the source and target languages and is, consequently, not an ideal solution to the task of sub-structural alignment. An algorithm to accomplish this task automatically is described in (Groves et al., 2004). Reduced-scale, preliminary experiments on data aligned using this algorithm provide evidence that high-quality translations can also be produced using automatically-induced alignments; we discuss t... |

25 | Data-oriented translation
- Poutsma
- 1998
Citation Context ...le, remains an unsolved problem to which many solutions are possible. One possible solution to the challenge of developing an optimal hybrid MT framework is the Data-Oriented Translation (DOT) model (Poutsma, 1998, 2000, 2003), which is based on Data-Oriented Parsing (DOP) (e.g. (Bod, 1992; Bod et al., 2003)) and combines examples, linguistic information and a statistical translation model. Studies of this mod... |

24 | Using Lexicalized Tags for Machine Translation
- Abeillé, Schabes, et al.
- 1990
Citation Context ...l translation dependencies whereas others – such as the fragment in example (4.2) – are highly specific. In fact, as for DOT, the ‘transfer rules’ of a Synchronous Lexicalised Tree-Adjoining Grammar (Abeillé et al., 1990) comprise linked syntactic subtrees reflecting syntactic and functional dependencies not easily captured using localised rewrite rules. However, unlike hand-coded transfer components, DOT shares ... |

24 | Greedy decoding for statistical machine translation in almost linear time
- Germann
- 2003
Citation Context ...inx (but [Footnote 8:] Training was carried out using Giza++ (Och and Ney, 2003) downloaded from http://www.fjoch.com/GIZA++.html. Translations were generated using the ISI ReWrite Decoder (Germann et al., 2001; Germann, 2003) downloaded from http://www.isi.edu/licensed-sw/rewritedecoder/ and the CMU-Cambridge Statistical Language Modeling toolkit (Clarkson and Rosenfeld, 1997) downloaded from http://mi.eng.cam.ac.uk/ prc... |

24 | Automatic evaluation of machine translation quality using n-gram co-occurrence statistics
- NIST
- 2001
Citation Context ...st reference translations using a metric he defined himself (Poutsma, 2000):58, Poutsma’s work predates the development of the automatic evaluation metrics – Bleu (Papineni et al., 2001, 2002), NIST (NIST, 2002; Doddington, 2002) and F-score (Melamed et al., 2003; Turian et al., 2003) – currently in use. Poutsma’s metric, termed ‘Largest Translation Part’, is far less sophisticated than these newer metrics... |

23 | Efficient parsing of DOP with PCFG-reductions - Goodman - 2003 |

22 | A Survey of Formal Grammars and Algorithms for Recognition and Transformation - Vauquois - 1968 |

20 | wEBMT: developing and validating an example-based machine translation system using the world wide web - Way, Gough - 2003 |

19 | MBT2: A Method for Combining Fragments of Examples
- Sato
- 1995
Citation Context ...stic information can be imported directly by using aligned annotated text rather than simply aligned sentences. Such systems can, for example, store aligned pairs of word-dependency structures (e.g. (Sato, 1995; Menezes and Richardson, 2003)). In this situation, a parser is used to analyse the input string and the parser output matched against the source representations in the example base. The retrieved ta... |

18 | Combining semantic and syntactic structure for language modeling - Bod - 2000 |

17 | Monte Carlo Parsing - Bod - 1996 |