Wide coverage parsing with stochastic attribute value grammars
- In Proceedings of the IJCNLP-04 Workshop: Beyond, 2004
Cited by 56 (5 self)
Stochastic Attribute Value Grammars (SAVG) provide an attractive framework for syntactic analysis, because they allow the combination of linguistic sophistication with a principled treatment of ambiguity. The paper introduces a wide-coverage SAVG for Dutch, known as Alpino, and we show how this SAVG can be efficiently applied, using a beam search algorithm to recover parses from a shared parse forest. Unlike previous approaches, this algorithm does not place strict locality restrictions on the features used for disambiguation. Experimental results for a number of different corpora suggest that the SAVG framework is applicable to realistically sized grammars and corpora.
Learning Efficient Parsing
Cited by 4 (1 self)
A corpus-based technique is described to improve the efficiency of wide-coverage high-accuracy parsers. By keeping track of the derivation steps which lead to the best parse for a very large collection of sentences, the parser learns which parse steps can be filtered without significant loss in parsing accuracy, but with an important increase in parsing efficiency. An interesting characteristic of our approach is that it is self-learning, in the sense that it uses unannotated corpora.
Statistical Parsing of Dutch using Maximum Entropy Models with Feature Merging
- In Proceedings of the Natural Language Processing Pacific Rim Symposium, 2001
Cited by 2 (0 self)
In this project report we describe work in statistical parsing using the maximum entropy technique and the Alpino language analysis system for Dutch. A major difficulty in this domain is the lack of sufficient corpus data available for training. Among other problems, this sparseness of data increases the danger of the model overfitting the training data, making it particularly important that the selection of statistical features upon which to base the model be optimal. To this end we have adapted the notion of feature merging, a means of constructing equivalence classes of statistical features based upon common elements within them. In spite of promising preliminary results, subsequent tests have not enabled us to conclude whether this approach helps the kind of models we are working with.

