Results 1 - 10 of 10
Generalized Probabilistic LR Parsing of Natural Language (Corpora) with Unification-Based Grammars
Computational Linguistics, 1993
Practical Unification-based Parsing of Natural Language
1993
Cited by 46 (7 self)
The thesis describes novel techniques and algorithms for the practical parsing of realistic Natural Language (NL) texts with a wide-coverage unification-based grammar of English. The thesis tackles two of the major problems in this area: firstly, the fact that parsing realistic inputs with such grammars can be computationally very expensive, and secondly, the observation that many analyses are often assigned to an input, only one of which usually forms the basis of the correct interpretation. The thesis starts by presenting a new unification algorithm, justifies why it is well-suited to practical NL parsing, and describes a bottom-up active chart parser which employs this unification algorithm together with several other novel processing and optimisation techniques. Empirical results demonstrate that an implementation of this parser has significantly better practical
GLR*: A Robust Grammar-Focused Parser for Spontaneously Spoken Language
1996
Cited by 40 (9 self)
The analysis of spoken language is widely considered to be a more challenging task than the analysis of written text. All of the difficulties of written language can generally be found in spoken language as well. Parsing spontaneous speech must, however, also deal with problems such as speech disfluencies, the looser notion of grammaticality, and the lack of clearly marked sentence boundaries. The contamination of the input with errors of a speech recognizer can further exacerbate these problems. Most natural language parsing algorithms are designed to analyze "clean" grammatical input. Because they reject any input which is found to be ungrammatical in even the slightest way, such parsers are unsuitable for parsing spontaneous speech, where completely grammatical input is the exception more than the rule. This thesis describes GLR*, a parsing system based on Tomita's Generalized LR parsing algorithm, that was designed to be robust to two particular types of extra-grammaticality: noise...
Relating complexity to practical performance in parsing with wide-coverage unification grammars
1994
Cited by 30 (6 self)
The paper demonstrates that exponential complexities with respect to grammar size and input length have little impact on the performance of three unification-based parsing algorithms, using a wide-coverage grammar. The results imply that the study and optimisation of unification-based parsing must rely on empirical data until complexity theory can more accurately predict the practical behaviour of such parsers.
Probabilistic parsing strategies
In 42nd Annual Meeting of the Association for Computational Linguistics, 2004
Cited by 5 (1 self)
We present new results on the relation between purely symbolic context-free parsing strategies and their probabilistic counterparts. Such parsing strategies are seen as constructions of push-down devices from grammars. We show that preservation of probability distribution is possible under two conditions, viz. the correct-prefix property and the property of strong predictiveness. These results generalize existing results in the literature that were obtained by considering parsing strategies in isolation. From our general results we also derive negative results on so-called generalized LR parsing.
Probabilistic Language Modeling for Generalized LR Parsing
1998
Cited by 4 (3 self)
In this thesis, we introduce probabilistic models to rank the likelihood of resultant parses within the GLR parsing framework. Probabilistic models can also bring about the benefit of reduction of search space, if the models allow prefix probabilities for partial parses. In devising the models, we carefully observe the nature of GLR parsing, one of the most efficient parsing algorithms in existence, and formalize two probabilistic models with the appropriate use of the parsing context. The context in GLR parsing is provided by the constraints afforded by context-free grammars in generating an LR table (global context), and the constraints of adjoining pre-terminal symbols (local n-gram context).
A Probabilistic Chunker
In: Proceedings of ROCLING VI, 1993
Cited by 2 (2 self)
This paper proposes a probabilistic partial parser, which we call a chunker. The chunker partitions the input sentence into segments. This idea is motivated by the fact that when we read a sentence, we read it chunk by chunk. We train the chunker on the Susanne Corpus, a modified but reduced version of the Brown Corpus, with an underlying bi-gram language model. The experiment is evaluated by an outside test and an inside test. The preliminary results show the chunker has more than 98% chunk correct rate and 94% sentence correct rate in the outside test, and 99% chunk correct rate and 97% sentence correct rate in the inside test. The simple but effective chunker design has been shown to be promising and can be extended to complete parsing and many applications.
1. Introduction
A probabilistic approach to natural language processing is not new [1]. Recently, many parsers based on this line have been proposed [2-9]. Garside and Leech [2] apply the constituent-likelihood grammar of Atwell [10] to probabilist...
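The abstract above reports a chunk correct rate and a sentence correct rate without defining them in this excerpt. The sketch below shows one plausible way such figures could be computed. It assumes that a chunk counts as correct when its token span matches a gold-standard span exactly, and that a sentence counts as correct when all of its chunks match. These definitions, the function names `chunk_spans` and `correct_rates`, and the toy data are illustrative assumptions, not taken from the paper.

```python
def chunk_spans(chunks):
    """Convert a list of chunks (each a list of tokens) into a set of (start, end) spans."""
    spans, start = set(), 0
    for chunk in chunks:
        end = start + len(chunk)
        spans.add((start, end))
        start = end
    return spans


def correct_rates(predicted, gold):
    """Return (chunk correct rate, sentence correct rate) under the assumed definitions.

    predicted, gold: parallel, non-empty lists of sentences; each sentence is a
    list of chunks, and each chunk is a list of tokens.
    """
    total_chunks = correct_chunks = correct_sentences = 0
    for pred_sent, gold_sent in zip(predicted, gold):
        pred_spans, gold_spans = chunk_spans(pred_sent), chunk_spans(gold_sent)
        total_chunks += len(gold_spans)
        correct_chunks += len(pred_spans & gold_spans)   # exact span matches
        correct_sentences += pred_spans == gold_spans    # every chunk matches
    return correct_chunks / total_chunks, correct_sentences / len(gold)


# Toy data (hypothetical): the second sentence is chunked incorrectly.
gold = [[["the", "cat"], ["sat"]], [["on"], ["the", "mat"]]]
predicted = [[["the", "cat"], ["sat"]], [["on", "the"], ["mat"]]]
chunk_rate, sentence_rate = correct_rates(predicted, gold)
```

Comparing spans rather than token lists keeps repeated words from producing accidental matches.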
Linguistics With Enriching Statistics: Performance Models Of Natural Language
University of Amsterdam, 1995
© 1995 by Rens Bod. All rights reserved. Printed in the Netherlands by Academische Pers, Amsterdam. Acknowledgements This thesis benefitted from discussions with many people. I would like to express my thanks to Martin van den Berg, Kenneth Church, Marc Dymetman, Bipin Indurkhya, Laszlo Kalman, Ronald Kaplan, Martin Kay, Steven Krauwer, Kwee Tjoe Liong, Neza van der Leeuw, David Magerman, Arie Mijnlieff, Fernando Pereira, Philip Resnik, Yves Schabes, Khalil Sima'an and Frederik Somsen. Furthermore, I wish to thank the members of the graduation committee: Renate Bartsch, Jan van Eijck, Gerard Kempen, Chris Klaassen and Anton Nijholt. I am grateful to Steven Krauwer for allowing me to work at this thesis while I was involved in the CLASK project ("Combining Linguistic and Statistical Knowledge") at Utrecht University. The fruitful discussions and positive cooperation with my colleagues Martin van den Berg and Khalil Sima'an have been of incalculable value
A Node-Driven Parse Pruning Technique for Probabilistic GLR Parsing
This paper proposes a new technique, a node-driven parse pruning technique, for pruning the less probable parses in the GLR parsing algorithm. Without decreasing the efficiency of GLR parsing, this technique estimates the number of parses in the GSS (graph-structured stack) based on the number of expanded nodes during the parse process. We show the evaluation results of various beam width settings for pruning, and compare the parse time and space consumption against full parsing results. Our node-driven parse pruning algorithm allows pruning in a left-to-right manner without modifying the GSS.
KEY WORDS: Probabilistic GLR parsing, parse pruning, GSS, beam search.
1 Introduction
Pruning is an essential paradigm to reduce the search space in parsing. The idea of pruning is to exclude hypotheses from further investigation if the parses turn out to be unlikely, based on evaluation of partial data. The efficiency of pruning technique ...
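The general idea of beam-width pruning mentioned in the abstract above can be illustrated with a small sketch. This is generic beam pruning over scored hypotheses, not the paper's node-driven technique, and it does not model the GSS. The function name `beam_prune`, the log-probability scoring, and the example hypotheses are assumptions made for illustration.

```python
import math


def beam_prune(hypotheses, beam_width):
    """Keep hypotheses whose log-probability lies within beam_width of the best one.

    hypotheses: list of (label, log_probability) pairs.
    beam_width: non-negative width in log-probability units.
    """
    if not hypotheses:
        return []
    best = max(logp for _, logp in hypotheses)
    # Drop any hypothesis that falls more than beam_width below the best score.
    return [(label, logp) for label, logp in hypotheses if logp >= best - beam_width]


# Hypothetical partial parses with made-up probabilities.
parses = [
    ("parse_a", math.log(0.5)),
    ("parse_b", math.log(0.3)),
    ("parse_c", math.log(0.001)),
]
# A width of log(100) keeps parses at most 100 times less probable than the best.
kept = beam_prune(parses, beam_width=math.log(100))
```

A wider beam keeps more hypotheses at the cost of time and space, which is the trade-off the abstract says is measured across beam width settings.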
RELATING COMPLEXITY TO PRACTICAL PERFORMANCE IN PARSING WITH WIDE-COVERAGE UNIFICATION GRAMMARS
1994
The paper demonstrates that exponential complexities with respect to grammar size and input length have little impact on the performance of three unification-based parsing algorithms, using a wide-coverage grammar. The results imply that the study and optimisation of unification-based parsing must rely on empirical data until complexity theory can more accurately predict the practical behaviour of such parsers.

