## An Efficient Implementation of a New DOP Model (2003)

Venue: EACL

Citations: 30 (6 self)

### BibTeX

@INPROCEEDINGS{Bod03anefficient,
    author = {Rens Bod},
    title = {An Efficient Implementation of a New DOP Model},
    booktitle = {EACL},
    year = {2003}
}

### Abstract

Two apparently opposing DOP models exist in the literature: one which computes the parse tree involving the most frequent subtrees from a treebank and one which computes the parse tree involving the fewest subtrees from a treebank. This paper proposes an integration of the two models which outperforms each of them separately. Together with a PCFG-reduction of DOP we obtain improved accuracy and efficiency on the Wall Street Journal treebank. Our results show an 11% relative reduction in error rate over previous models, and an average processing time of 3.6 seconds per WSJ sentence.
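The integration mentioned in the abstract can be read, together with the Marcus et al. citation context further down ("the likeliest tree among the n simplest ones (for LS-DOP)"), as a two-stage selection over candidate parses. The sketch below is only an illustration under stated assumptions: the `Candidate` class, the function names, the toy numbers, and the tie-breaking rules are hypothetical and not taken from the paper; each candidate is assumed to already carry a probability and the length of its shortest derivation (number of subtrees).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container for one candidate parse of a sentence."""
    tree: str            # bracketed parse, used only as a label here
    probability: float   # likelihood criterion: higher is better
    derivation_len: int  # simplicity criterion: fewer subtrees is better


def simplest_among_likeliest(candidates, n):
    """Pick the simplest tree among the n most probable ones."""
    likeliest = sorted(candidates, key=lambda c: -c.probability)[:n]
    # Assumed tie-break: prefer the more probable tree among equally simple ones.
    return min(likeliest, key=lambda c: (c.derivation_len, -c.probability))


def likeliest_among_simplest(candidates, n):
    """Pick the most probable tree among the n simplest ones."""
    simplest = sorted(candidates, key=lambda c: c.derivation_len)[:n]
    # Assumed tie-break: prefer the simpler tree among equally probable ones.
    return max(simplest, key=lambda c: (c.probability, -c.derivation_len))


# Toy candidates with made-up scores.
CANDIDATES = [
    Candidate("t1", 0.50, 4),
    Candidate("t2", 0.30, 2),
    Candidate("t3", 0.20, 3),
    Candidate("t4", 0.05, 1),
]

if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n,
              simplest_among_likeliest(CANDIDATES, n).tree,
              likeliest_among_simplest(CANDIDATES, n).tree)
```

With n = 1 each function reduces to one pure criterion (likelihood only, or simplicity only); larger n lets the second criterion override the first, which is the sense in which the two models are combined here.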

### Citations

991 | Head-driven statistical models for natural language parsing - Collins - 2003 |
Citation context: ...-words, later models showed the importance of including context from higher nodes in the tree (Charniak 1997; Johnson 1998a). The importance of including nonheadwords has become uncontroversial (e.g. Collins 1999; Charniak 2000; Goodman 1998). And Collins (2000) argues for "keeping track of counts of arbitrary fragments within parse trees", which has indeed been carried out in Collins and Duffy (2002) who use...

852 | A maximum-entropy-inspired parser - Charniak - 2000 |
Citation context: ...models showed the importance of including context from higher nodes in the tree (Charniak 1997; Johnson 1998a). The importance of including nonheadwords has become uncontroversial (e.g. Collins 1999; Charniak 2000; Goodman 1998). And Collins (2000) argues for "keeping track of counts of arbitrary fragments within parse trees", which has indeed been carried out in Collins and Duffy (2002) who use exactly the sa...

521 | Three generative, lexicalised models for statistical parsing - Collins - 1997 |

372 | Parsing with context-free grammar and word statistics - Charniak - 1995 |
Citation context: ...ns. While the models of Collins (1996) and Eisner (1996) restricted the fragments to the locality of head-words, later models showed the importance of including context from higher nodes in the tree (Charniak 1997; Johnson 1998a). The importance of including nonheadwords has become uncontroversial (e.g. Collins 1999; Charniak 2000; Goodman 1998). And Collins (2000) argues for "keeping track of counts of arbitr...

275 | Inside-outside reestimation from partially bracketed corpora - Pereira, Schabes - 1992 |
Citation context: ...predefined grammar and used a corpus only for estimating the rule probabilities (as e.g. in Fujisaki et al. 1989; Black et al. 1992, 1993; Briscoe and Waegner 1992; Pereira and Schabes 1992). The DOP model, on the other hand, was the first model (to the best of our knowledge) that proposed not to train a predefined grammar on a corpus, but to directly use corpus fragments as a grammar. ...

267 | Three new probabilistic models for dependency parsing: An exploration - Eisner - 1996 |

151 | Beyond Grammar: An Experience-Based Theory of Language - Bod - 1998 |
Citation context: ...ima'an 1996), mainly because the same parse tree can be generated by exponentially many derivations. Many implementations of DOP1 therefore estimate the most probable parse by Monte Carlo techniques (Bod 1998; Chappelier & Rajman 2000), or by Viterbi n-best search (Bod 2001), or by restricting the set of subtrees (Sima'an 1999; Chappelier et al. 2002). Sima'an (1995) gave an efficient algorithm for comput...

121 | Stochastic Lexicalized Tree-Adjoining Grammars - Schabes - 1992 |
Citation context: ...rammar and used a corpus only for estimating the rule probabilities (as e.g. in Fujisaki et al. 1989; Black et al. 1992, 1993; Briscoe and Waegner 1992; Pereira and Schabes 1992). The DOP model, on the other hand, was the first model (to the best of our knowledge) that proposed not to train a predefined grammar on a corpus, but to directly use corpus fragments as a grammar. ...

103 | Computational Complexity of Probabilistic Disambiguation by means of Tree Grammars - Sima'an - 1996 |
Citation context: ...nition). Bod (1993) showed how standard parsing techniques can be applied to DOP1 by converting subtrees into rules. However, the problem of computing the most probable parse turns out to be NP-hard (Sima'an 1996), mainly because the same parse tree can be generated by exponentially many derivations. Many implementations of DOP1 therefore estimate the most probable parse by Monte Carlo techniques (Bod 1998; C...

86 | Parsing Inside-Out - Goodman - 1998 |
Citation context: ...he importance of including context from higher nodes in the tree (Charniak 1997; Johnson 1998a). The importance of including nonheadwords has become uncontroversial (e.g. Collins 1999; Charniak 2000; Goodman 1998). And Collins (2000) argues for "keeping track of counts of arbitrary fragments within parse trees", which has indeed been carried out in Collins and Duffy (2002) who use exactly the same set of (all...

59 | Statistically-Driven Computer Grammars of English: The IBM/Lancaster Approach, Rodopi: Amsterdam-Atlanta - Black, Garside, et al. - 1993 |

59 | Building a Large Annotated - Marcus, Santorini, et al. - 1993 |
Citation context: ...- DOP) or the likeliest tree among the n simplest ones (for LS-DOP). In our experiments, n will never be larger than 1,000. 4 Experiments. For our experiments we used the standard division of the WSJ (Marcus et al. 1993), with sections 2 through 21 for training (approx. 40,000 sentences) and section 23 for testing (2416 sentences ≤ 100 words); section 22 was used as development set. As usual, all trees were stripped...

54 | Development and evaluation of a broad-coverage probabilistic grammar of English-language computer manuals - Black, Lafferty, et al. - 1992 |
Citation context: ...rom all other statistical parsing models at the time. Other models started off with a predefined grammar and used a corpus only for estimating the rule probabilities (as e.g. in Fujisaki et al. 1989; Black et al. 1992, 1993; Briscoe and Waegner 1992; Pereira and Schabes 1992). The DOP model, on the other hand, was the first model (to the best of our knowledge) that proposed not t...

54 | Taaltheorie en taaltechnologie; competence en performance - Scha - 1990 |

49 | Bilexical grammars and a cubic-time probabilistic parser - Eisner - 1997 |

41 | Robust stochastic parsing using the inside-outside algorithm - Briscoe, Waegner - 1992 |

36 | Parsing with the shortest derivation - Bod - 2000 |

33 | The DOP estimation method is biased and inconsistent - Johnson |
Citation context: ...odels of Collins (1996) and Eisner (1996) restricted the fragments to the locality of head-words, later models showed the importance of including context from higher nodes in the tree (Charniak 1997; Johnson 1998a). The importance of including nonheadwords has become uncontroversial (e.g. Collins 1999; Charniak 2000; Goodman 1998). And Collins (2000) argues for "keeping track of counts of arbitrary fragments ...

32 | An optimized algorithm for Data Oriented Parsing - Sima'an - 1996 |

30 | A unified model of structural organization in language and music - Bod - 2002 |

28 | Using an Annotated Language Corpus as a Virtual Stochastic Grammar - Bod - 1993 |

27 | Learning Efficient Disambiguation - Sima'an - 1999 |
Citation context: ...tions of DOP1 therefore estimate the most probable parse by Monte Carlo techniques (Bod 1998; Chappelier & Rajman 2000), or by Viterbi n-best search (Bod 2001), or by restricting the set of subtrees (Sima'an 1999; Chappelier et al. 2002). Sima'an (1995) gave an efficient algorithm for computing the parse tree generated by the most probable derivation, which in some cases is a reasonable approximation of the m...

25 | Statistical parsing with an automatically extracted tree adjoining grammar - Chiang - 2003 |

23 | Efficient parsing of DOP with PCFG-reductions - Goodman - 2003 |

21 | A Probabilistic Method for Sentence Disambiguation - Fujisaki, Jelinek, et al. |
Citation context: ... radically different from all other statistical parsing models at the time. Other models started off with a predefined grammar and used a corpus only for estimating the rule probabilities (as e.g. in Fujisaki et al. 1989; Black et al. 1992, 1993; Briscoe and Waegner 1992; Pereira and Schabes 1992). The DOP model, on the other hand, was the first model (to the best of our knowledge) ...

18 | Combining semantic and syntactic structure for language modeling - Bod - 2000 |

12 | New Ranking Algorithms for Parsing and Tagging - Collins, Duffy - 2002 |

6 | Discriminative Reranking for Natural Language - Collins - 2000 |

5 | A DOP Model for - Bonnema, Bod, et al. - 1997 |

5 | Monte Carlo Sampling for NP-hard Maximization - Chappelier, Rajman - 2000 |
Citation context: ...6), mainly because the same parse tree can be generated by exponentially many derivations. Many implementations of DOP1 therefore estimate the most probable parse by Monte Carlo techniques (Bod 1998; Chappelier & Rajman 2000), or by Viterbi n-best search (Bod 2001), or by restricting the set of subtrees (Sima'an 1999; Chappelier et al. 2002). Sima'an (1995) gave an efficient algorithm for computing the parse tree generat...

5 | Efficient Algorithms for Parsing the DOP - Goodman - 1996 |
Citation context: ...d leaves, internal nonterminals elsewhere have probability 1/a. And subderivations headed by A_j with external nonterminals only at the leaves, internal nonterminals elsewhere, have probability 1/a_j (Goodman 1996). Goodman's main theorem is that this construction produces PCFG derivations isomorphic to DOP derivations with equal probability. This means that summing up over derivations of a tree in DOP yields ...

4 | A New Statistical Parser Based on Bigram - Collins - 1996 |
Citation context: ...re the two PCFG-reductions in Section 2.2, which we will refer to resp. as Bod01 and Bon99. Table 1 gives the results of these experiments and compares them with some other statistical parsers (resp. Collins 1996, Charniak 1997, Collins 1999 and Charniak 2000).

Table 1 (excerpt), sentences ≤ 40 words:

| Parser | LP | LR |
|--------|------|------|
| Coll96 | 86.3 | 85.8 |
| Char97 | 87.4 | 87.5 |
| Coll99 | 88.7 | 88.5 |
| Char00 | 90.1 | 90.1 |
| Bod01 | 90.3 | 90.1 |
| Bon99 | 86.7 | 86.0 |

Sentences ≤ 100 words: Coll96 85.7 85....

3 | What is the Minimal Set of Subtrees that - Bod - 2001 |

3 | Do All Fragments Count? Natural Language Engineering - Bod |

3 | A New Probability Model for Data-Oriented - Bonnema, Buying, et al. - 1999 |
Citation context: ...same job with a more compact grammar. 2.2 PCFG-Reductions of Bod (2001) and Bonnema et al. (1999). DOP1 has a serious bias: its subtree estimator provides more probability to nodes with more subtrees (Bonnema et al. 1999). The amount of probability given to two different training nodes depends on how many subtrees they have, and, given that the number of subtrees is an exponential function, this means that some train...

3 | Polynomial Tree Substitution Grammars: Characterization and new examples - Chappelier, Rajman, et al. - 2002 |
Citation context: ...therefore estimate the most probable parse by Monte Carlo techniques (Bod 1998; Chappelier & Rajman 2000), or by Viterbi n-best search (Bod 2001), or by restricting the set of subtrees (Sima'an 1999; Chappelier et al. 2002). Sima'an (1995) gave an efficient algorithm for computing the parse tree generated by the most probable derivation, which in some cases is a reasonable approximation of the most probable parse. Good...

2 | Data Oriented - Bod - 1992 |
Citation context: ...re also becoming relevant for theoretical linguistics (see Bod et al. 2003a). 1.2 DOP1 in Retrospective. One instantiation of DOP which has received considerable interest is the model known as DOP1 (Bod 1992). DOP1 combines subtrees from a treebank by means of node-substitution and computes the probability of a tree from the normalized frequencies of the subtrees (see Section 2 for a full definition). Bo...

1 | Parsing Inside-Out - Goodman - 1998 |
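The Goodman (1996) citation context above refers to probabilities 1/a and 1/a_j, where a_j is the number of subtrees headed by a particular treebank node and a is the total for all nodes sharing that node's label. A minimal sketch of how those counts could be computed follows. It rests on several assumptions not stated in the excerpt: the recurrence a_j = (b_k + 1)(c_l + 1) for a binary node with children counts b_k and c_l comes from Goodman's paper, its extension to nodes with any number of children is an illustrative generalization, and the tuple encoding, the toy tree, and the function name `subtree_counts` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical toy treebank tree: (label, child, ...) tuples, words as plain strings.
TREE = ("S", ("NP", "she"), ("VP", ("V", "saw"), ("NP", "him")))


def subtree_counts(tree):
    """Return ([(label, a_j), ...] in post-order, {label: a})."""
    per_node = []                 # one (label, a_j) entry per nonterminal node
    per_label = defaultdict(int)  # a: sum of a_j over nodes with the same label

    def visit(node):
        if isinstance(node, str):
            return 0              # a word heads no subtree
        label, *children = node
        a_j = 1
        for child in children:
            # Each child is either cut off (1 way) or expanded by one of its subtrees.
            a_j *= visit(child) + 1
        per_node.append((label, a_j))
        per_label[label] += a_j
        return a_j

    visit(tree)
    return per_node, dict(per_label)


if __name__ == "__main__":
    per_node, per_label = subtree_counts(TREE)
    for label, a_j in per_node:
        # Columns: label, a_j, 1/a_j, 1/a
        print(label, a_j, Fraction(1, a_j), Fraction(1, per_label[label]))
```

The product in the loop also illustrates why the count grows exponentially with tree size, which is the source of the bias described in the Bonnema et al. (1999) citation context.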