## Recovering Latent Information in Treebanks (2002)

### Download Links

- [www.cis.upenn.edu]
- [acl.ldc.upenn.edu]
- [wing.comp.nus.edu.sg]
- [www.aclweb.org]
- [aclweb.org]
- [ucrel.lancs.ac.uk]
- DBLP

### Other Repositories/Bibliography

Venue: In Proceedings of COLING 2002

Citations: 45 (2 self)

### BibTeX

```bibtex
@INPROCEEDINGS{Chiang02recoveringlatent,
  author    = {David Chiang and Daniel M. Bikel},
  title     = {Recovering Latent Information in Treebanks},
  booktitle = {Proceedings of COLING 2002},
  year      = {2002},
  pages     = {183--189}
}
```

### Abstract

Many recent statistical parsers rely on a preprocessing step which uses hand-written, corpus-specific rules to augment the training data with extra information. For example, head-finding rules are used to augment node labels with lexical heads. In this paper, we provide machinery to reduce the amount of human effort needed to adapt existing models to new corpora: first, we propose a flexible notation for specifying these rules that would allow them to be shared by different models; second, we report on an experiment to see whether we can use Expectation-Maximization to automatically fine-tune a set of hand-written rules to a particular corpus.
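The head-finding step the abstract describes can be illustrated with a minimal sketch; the rule table and function below are hypothetical, not the notation the paper proposes:

```python
# Minimal sketch of hand-written head-finding rules, in the style used to
# lexicalize treebank trees. The table entries are illustrative only.

# Hypothetical head table: parent label -> (search direction, priority list).
HEAD_RULES = {
    "S":  ("left",  ["VP", "S"]),
    "VP": ("left",  ["VBD", "VBZ", "VB", "VP"]),
    "NP": ("right", ["NN", "NNS", "NP"]),
}

def find_head(label, children):
    """Return the index of the head child among `children` (a list of labels)."""
    direction, priorities = HEAD_RULES.get(label, ("left", []))
    indices = list(range(len(children)))
    if direction == "right":
        indices.reverse()
    for wanted in priorities:
        for i in indices:
            if children[i] == wanted:
                return i
    return indices[0]  # fallback: first child in the search direction

# The head index is then used to propagate a lexical head up the tree,
# e.g. augmenting "VP" to "VP(caught--VBD)".
```

Such tables are exactly the corpus-specific, hand-written resource the paper aims to make sharable and tunable.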

### Citations

8089 | Maximum likelihood from incomplete data via the EM algorithm
- Dempster, Laird, et al.
- 1977
Citation Context: ...r the development of a statistical parser. A + indicates augmentation. are performed by existing statistical parsers that we have examined. Second, we explore a novel use of Expectation-Maximization (Dempster et al., 1977) that iteratively reestimates a parsing model using the augmenting heuristics as a starting point. Specifically, the EM algorithm we use is a variant of the Inside-Outside algorithm (Baker, 1979; Lar...

2105 | Building a Large Annotated Corpus of English: The Penn Treebank
- Marcus, Marcinkiewicz, et al.
- 1993
Citation Context: ...rtunately, this will necessarily vary somewhat across treebanks: all we can define that is truly treebank-independent is the # pattern, which matches any label. For Penn Treebank II style annotation (Marcus et al., 1993), in which a nonterminal symbol is a category together with zero or more functional tags, we adopt the following scheme: the atomic pattern a matches any label with category a or functional tag a; mo...
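The label-matching scheme quoted in this context (category plus functional tags) can be sketched directly; the function names are mine, and coindices on labels are simply dropped:

```python
def parse_label(label):
    # Penn Treebank II nonterminals are a category plus zero or more
    # functional tags joined by '-', e.g. "NP-SBJ-1" has category "NP"
    # and functional tag "SBJ" (trailing numeric coindices are ignored).
    parts = [p for p in label.split("-") if not p.isdigit()]
    return parts[0], set(parts[1:])

def atomic_match(pattern, label):
    # Per the scheme in the citation context: the atomic pattern a
    # matches any label with category a or functional tag a.
    category, tags = parse_label(label)
    return pattern == category or pattern in tags
```

So `atomic_match("SBJ", "NP-SBJ")` and `atomic_match("NP", "NP-SBJ")` both succeed, while `atomic_match("VP", "NP-SBJ")` does not.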

953 | Head-Driven Statistical Models for Natural Language Parsing
- Collins
- 1999
Citation Context: ...VP(caught--VBD) VBD caught NP(ball--NN) DET the NN ball Figure 2: A simple lexicalized parse tree. discriminative models described in (Magerman, 1995; Ratnaparkhi, 1997), the lexicalized PCFG models in (Collins, 1999), the generative model in (Charniak, 2000), the lexicalized TAG extractor in (Xia, 1999) and the stochastic lexicalized TAG models in (Chiang, 2000; Sarkar, 2001; Chen and Vijay-Shanker, 2000). Induci...

822 | A Maximum-Entropy-Inspired Parser
- Charniak
Citation Context: ...ET the NN ball Figure 2: A simple lexicalized parse tree. discriminative models described in (Magerman, 1995; Ratnaparkhi, 1997), the lexicalized PCFG models in (Collins, 1999), the generative model in (Charniak, 2000), the lexicalized TAG extractor in (Xia, 1999) and the stochastic lexicalized TAG models in (Chiang, 2000; Sarkar, 2001; Chen and Vijay-Shanker, 2000). Inducing a lexicalized structure based on heads ...

632 | Synchronous tree adjoining grammars
- Shieber, Schabes
- 1990
Citation Context: ...union operator ∪, the disjunction operator ∨ has no preference for its first argument. by Chiang (2000). TIG (Schabes and Waters, 1995) is a weakly context-free restriction of tree adjoining grammar (Joshi and Schabes, 1997), in which tree fragments called elementary trees are combined by two composition operations, substitution and adjunction (see Figure 3). In TIG there are certain restrictions on the adjunction opera...

373 | The estimation of stochastic context-free grammars using the Inside–Outside algorithm. Computer Speech and Language
- Lari, Young
- 1990
Citation Context: ...977) that iteratively reestimates a parsing model using the augmenting heuristics as a starting point. Specifically, the EM algorithm we use is a variant of the Inside-Outside algorithm (Baker, 1979; Lari and Young, 1990; Hwa, 1998). The reestimation adjusts the model's parameters in the augmented parse-tree space to maximize the likelihood of the observed (incomplete) data, in the hopes of finding a better distribut...

321 | Statistical decision-tree models for parsing
- Magerman
- 1995
Citation Context: ...S(caught--VBD) NP(boy--NN) DET The NN boy ADVP(also--RB) RB also VP(caught--VBD) VBD caught NP(ball--NN) DET the NN ball Figure 2: A simple lexicalized parse tree. discriminative models described in (Magerman, 1995; Ratnaparkhi, 1997), the lexicalized PCFG models in (Collins, 1999), the generative model in (Charniak, 2000), the lexicalized TAG extractor in (Xia, 1999) and the stochastic lexicalized TAG models i...

279 | Nymble: a high-performance learning name-finder
- Bikel, Miller, et al.
- 1997
Citation Context: ...ss are computed as follows: e = λ₁e₁ + (1 − λ₁)(λ₂e₂ + (1 − λ₂)e₃), where eᵢ is the estimate of p(X | φᵢ(Y)) for some future context X, and the λᵢ are computed by the formula found in (Bikel et al., 1997), modified to use the multiplicative constant 5 found in the similar formula of (Collins, 1999): λᵢ = (1 − dᵢ₋₁/dᵢ) · 1/(1 + 5uᵢ/dᵢ) (7), where dᵢ is the number of occurrences in training of ...
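The interpolation in this citation context can be written out directly. A small sketch (function names are mine; d₀ is taken as 0 for the first level, and C = 5 is the Collins-style constant mentioned in the context):

```python
def smoothing_weight(d_prev, d, u, C=5.0):
    # lambda_i = (1 - d_{i-1}/d_i) * 1/(1 + C*u_i/d_i), following the
    # Bikel et al. (1997) formula as modified with Collins' constant C = 5.
    # d: occurrence count at this back-off level; u: unique-outcome count.
    if d == 0:
        return 0.0  # unseen history: give all mass to the coarser levels
    return (1.0 - d_prev / d) * (1.0 / (1.0 + C * u / d))

def interpolate(e1, e2, e3, lam1, lam2):
    # e = lam1*e1 + (1 - lam1)*(lam2*e2 + (1 - lam2)*e3)
    return lam1 * e1 + (1.0 - lam1) * (lam2 * e2 + (1.0 - lam2) * e3)
```

A quick sanity check on the weights: when all three estimates agree, the interpolation returns that common value regardless of the lambdas.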

268 | Trainable grammars for speech recognition
- 1979
Citation Context: ...ter et al., 1977) that iteratively reestimates a parsing model using the augmenting heuristics as a starting point. Specifically, the EM algorithm we use is a variant of the Inside-Outside algorithm (Baker, 1979; Lari and Young, 1990; Hwa, 1998). The reestimation adjusts the model's parameters in the augmented parse-tree space to maximize the likelihood of the observed (incomplete) data, in the hopes of find...

211 | PCFG models of linguistic tree representations
- Johnson
- 1998
Citation Context: ...statistical parsing does not operate in the realm of parse trees as they appear in many treebanks, but rather on trees transformed via augmentation of their node labels, or some other transformation (Johnson, 1998). This methodology is illustrated in Figure 1. The information included in the node labels' augmentations may include lexical items, or a node label suffix to indicate the node is an argument and not a...
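One well-known instance of the node-label augmentation this context refers to is parent annotation; a minimal sketch (the tree encoding and function name are mine, not Johnson's):

```python
def parent_annotate(tree, parent="TOP"):
    # Augment each node label with its parent's category, e.g. NP -> NP^S.
    # A tree is (label, word) for a preterminal, or (label, [subtrees]).
    label, rest = tree
    if isinstance(rest, str):  # preterminal: annotate the label, keep the word
        return (label + "^" + parent, rest)
    return (label + "^" + parent,
            [parent_annotate(child, label) for child in rest])
```

The transformed trees live in an augmented label space; a model is then trained on these labels rather than the raw treebank ones.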

159 | A linear observed time statistical parser based on maximum entropy models
- Ratnaparkhi
- 1997
Citation Context: ...) NP(boy--NN) DET The NN boy ADVP(also--RB) RB also VP(caught--VBD) VBD caught NP(ball--NN) DET the NN ball Figure 2: A simple lexicalized parse tree. discriminative models described in (Magerman, 1995; Ratnaparkhi, 1997), the lexicalized PCFG models in (Collins, 1999), the generative model in (Charniak, 2000), the lexicalized TAG extractor in (Xia, 1999) and the stochastic lexicalized TAG models in (Chiang, 2000; Sa...

77 | Tree insertion grammar: A cubic-time parsable formalism that lexicalizes context-free grammar without changing the trees produced. Technical report, Mitsubishi Electric Research Laboratories
- Schabes, Waters
- 1994
Citation Context: ...hastic tree-insertion grammar (TIG) model described 1 Note that unlike the noncommutative union operator ∪, the disjunction operator ∨ has no preference for its first argument. by Chiang (2000). TIG (Schabes and Waters, 1995) is a weakly context-free restriction of tree adjoining grammar (Joshi and Schabes, 1997), in which tree fragments called elementary trees are combined by two composition operations, substitution and...

77 | Characterizing derivation trees of context-free grammars through a generalization of finite automata theory
- Thatcher
- 1967
Citation Context: ...with such an approach. First, writing down such a grammar would be tedious to say the least, and impossible if we want to handle trees with arbitrary branching factors. So we can use an extended CFG (Thatcher, 1967), a CFG whose right-hand sides are regular expressions. Thus we introduce a union operator (∪) and a Kleene star (∗) into the syntax for right-hand sides. The second problem that our grammar may be a...
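An extended CFG rule with a regular-expression right-hand side can be emulated with an ordinary regex over child-label sequences; the rule below is a made-up example, not one from the paper:

```python
import re

# Hypothetical extended-CFG rule NP -> DET? ADJP* NN+, expressed as a regex
# over the sequence of child labels (each label is space-terminated so that
# "NN" cannot accidentally match a prefix of "NNS").
RHS = {
    "NP": re.compile(r"(DET )?(ADJP )*(NN )+$"),
}

def rhs_matches(parent, children):
    # Encode the child sequence as a space-terminated string and match it
    # against the rule's regex; handles arbitrary branching factors.
    sequence = "".join(label + " " for label in children)
    pattern = RHS.get(parent)
    return bool(pattern and pattern.match(sequence))
```

The `?`, `*`, and `+` operators here play the role of the union and Kleene-star operators the context introduces for right-hand sides.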

64 | Automated Extraction of TAGs from the Penn Treebank
- Chen, Vijay-Shanker
- 2000

55 | Applying co-training methods to statistical parsing
- Sarkar
- 2001
Citation Context: ...97), the lexicalized PCFG models in (Collins, 1999), the generative model in (Charniak, 2000), the lexicalized TAG extractor in (Xia, 1999) and the stochastic lexicalized TAG models in (Chiang, 2000; Sarkar, 2001; Chen and Vijay-Shanker, 2000). Inducing a lexicalized structure based on heads has a two-pronged effect: it not only allows statistical parsers to be sensitive to lexical information by including this...

18 | An empirical evaluation of probabilistic lexicalized tree insertion grammars
- Hwa
- 1998
Citation Context: ...reestimates a parsing model using the augmenting heuristics as a starting point. Specifically, the EM algorithm we use is a variant of the Inside-Outside algorithm (Baker, 1979; Lari and Young, 1990; Hwa, 1998). The reestimation adjusts the model's parameters in the augmented parse-tree space to maximize the likelihood of the observed (incomplete) data, in the hopes of finding a better distribution over au...