Results 1 - 2 of 2
Punctuation: Making a Point in Unsupervised Dependency Parsing
Abstract - Cited by 5 (4 self)
We show how punctuation can be used to improve unsupervised dependency parsing. Our linguistic analysis confirms the strong connection between English punctuation and phrase boundaries in the Penn Treebank. However, approaches that naively include punctuation marks in the grammar (as if they were words) do not perform well with Klein and Manning’s Dependency Model with Valence (DMV). Instead, we split a sentence at punctuation and impose parsing restrictions over its fragments. Our grammar inducer is trained on the Wall Street Journal (WSJ) and achieves 59.5% accuracy out-of-domain (Brown sentences with 100 or fewer words), more than 6% higher than the previous best results. Further evaluation, using the 2006/7 CoNLL sets, reveals that punctuation aids grammar induction in 17 of 18 languages, for an overall average net gain of 1.3%. Some of this improvement is from training, but more than half is from parsing with induced constraints during inference. Punctuation-aware decoding works with existing (even already-trained) parsing models and always increased accuracy in our experiments.
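The splitting step mentioned in the abstract above can be illustrated with a minimal sketch. It covers only the segmentation of a tokenized sentence into fragments at punctuation marks. It does not implement the parsing restrictions or the DMV. The function name `split_at_punctuation` and the `PUNCTUATION` set are assumptions made for illustration, since the abstract does not give the authors' actual punctuation inventory or code.

```python
from typing import List

# Assumed punctuation inventory (hypothetical); the abstract does not
# specify which marks the authors treat as fragment boundaries.
PUNCTUATION = {",", ".", ";", ":", "?", "!", "--", "(", ")", "``", "''"}


def split_at_punctuation(tokens: List[str]) -> List[List[str]]:
    """Split a tokenized sentence into fragments of non-punctuation tokens.

    Punctuation tokens act as boundaries and are dropped from the output.
    Empty fragments (e.g. between adjacent punctuation marks) are skipped.
    """
    fragments: List[List[str]] = []
    current: List[str] = []
    for tok in tokens:
        if tok in PUNCTUATION:
            # Close the fragment collected so far, if any.
            if current:
                fragments.append(current)
                current = []
        else:
            current.append(tok)
    # Keep a trailing fragment that is not followed by punctuation.
    if current:
        fragments.append(current)
    return fragments


if __name__ == "__main__":
    sentence = "However , the plan failed .".split()
    print(split_at_punctuation(sentence))
```

A constrained parser would then treat each fragment as a unit when deciding which dependency attachments are allowed; how those constraints are defined is left to the paper itself.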
Semi-Markov Conditional Random Field with High-Order Features
Viet Cuong Nguyen
Abstract
We extend first-order semi-Markov conditional random fields (semi-CRFs) to include higher-order semi-Markov features, and present efficient inference and learning algorithms, under the assumption that the higher-order semi-Markov features are sparse. We empirically demonstrate that high-order semi-CRFs outperform high-order CRFs and first-order semi-CRFs on three sequence labeling tasks with long-distance dependencies.

