## Recovering Latent Information in Treebanks (2002)

Venue: Proceedings of COLING 2002

Citations: 45 (2 self)

@INPROCEEDINGS{Chiang02recoveringlatent,

author = {David Chiang and Daniel M. Bikel},

title = {Recovering Latent Information in Treebanks},

booktitle = {In Proceedings of COLING 2002},

year = {2002},

pages = {183--189}

}

### Abstract

Many recent statistical parsers rely on a preprocessing step which uses hand-written, corpus-specific rules to augment the training data with extra information. For example, head-finding rules are used to augment node labels with lexical heads. In this paper, we provide machinery to reduce the amount of human effort needed to adapt existing models to new corpora: first, we propose a flexible notation for specifying these rules that would allow them to be shared by different models; second, we report on an experiment to see whether we can use Expectation-Maximization to automatically fine-tune a set of hand-written rules to a particular corpus.
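To make the preprocessing step mentioned in the abstract concrete, the following is a minimal sketch of how head-finding rules can augment node labels with lexical heads. It is not the notation proposed in the paper: the `HEAD_RULES` table, the tuple-based tree encoding, and the function names are all hypothetical and heavily simplified for illustration.

```python
from typing import List, Tuple

# Assumed tree encoding (not from the paper):
#   leaf:     (tag, word)            e.g. ("NN", "dog")
#   internal: (label, [children])    e.g. ("NP", [("DT", "the"), ("NN", "dog")])

# Hypothetical, simplified head rules: for each parent label, a scan
# direction and a priority list of child labels to look for.
HEAD_RULES = {
    "S": ("left", ["VP", "S"]),
    "VP": ("left", ["VBD", "VBZ", "VB", "VP"]),
    "NP": ("right", ["NN", "NNS", "NNP", "NP"]),
}


def is_leaf(tree) -> bool:
    """A leaf pairs a part-of-speech tag with a word (a string)."""
    return isinstance(tree[1], str)


def head_child_index(label: str, children: List[tuple]) -> int:
    """Pick the head child by trying each priority label in scan order."""
    direction, priorities = HEAD_RULES.get(label, ("left", []))
    order = list(range(len(children)))
    if direction == "right":
        order.reverse()
    for wanted in priorities:
        for i in order:
            if children[i][0] == wanted:
                return i
    # Fallback: first child in the scan direction.
    return order[0]


def lexicalize(tree) -> Tuple[tuple, str]:
    """Return (tree with labels augmented as LABEL[head], head word)."""
    if is_leaf(tree):
        tag, word = tree
        return (f"{tag}[{word}]", word), word
    label, children = tree
    results = [lexicalize(child) for child in children]
    head = results[head_child_index(label, children)][1]
    return (f"{label}[{head}]", [r[0] for r in results]), head


if __name__ == "__main__":
    sentence = ("S", [
        ("NP", [("DT", "the"), ("NN", "dog")]),
        ("VP", [("VBD", "barked")]),
    ])
    annotated, head = lexicalize(sentence)
    print(annotated[0], head)
```

Because the rules in this sketch live in a plain data table rather than in parser code, retargeting it to a different corpus would mean editing only `HEAD_RULES`, which is the kind of sharing across models that the abstract argues for.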

