Results 1 - 10 of 18
Generalized Probabilistic LR Parsing of Natural Language (Corpora) with Unification-Based Grammars
- COMPUTATIONAL LINGUISTICS
, 1993
An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities
- Computational Linguistics
, 2002
Cited by 155 (5 self)
this article can compute solutions to all four of these problems in a single framework, with a number of additional advantages over previously presented isolated solutions
Practical Unification-based Parsing of Natural Language
, 1993
Cited by 46 (7 self)
The thesis describes novel techniques and algorithms for the practical parsing of realistic Natural Language (NL) texts with a wide-coverage unification-based grammar of English. The thesis tackles two of the major problems in this area: firstly, the fact that parsing realistic inputs with such grammars can be computationally very expensive, and secondly, the observation that many analyses are often assigned to an input, only one of which usually forms the basis of the correct interpretation. The thesis starts by presenting a new unification algorithm, justifies why it is well-suited to practical NL parsing, and describes a bottom-up active chart parser which employs this unification algorithm together with several other novel processing and optimisation techniques. Empirical results demonstrate that an implementation of this parser has significantly better practical
GLR*: A Robust Grammar-Focused Parser for Spontaneously Spoken Language
, 1996
Cited by 40 (9 self)
The analysis of spoken language is widely considered to be a more challenging task than the analysis of written text. All of the difficulties of written language can generally be found in spoken language as well. Parsing spontaneous speech must, however, also deal with problems such as speech disfluencies, the looser notion of grammaticality, and the lack of clearly marked sentence boundaries. The contamination of the input with errors of a speech recognizer can further exacerbate these problems. Most natural language parsing algorithms are designed to analyze "clean" grammatical input. Because they reject any input which is found to be ungrammatical in even the slightest way, such parsers are unsuitable for parsing spontaneous speech, where completely grammatical input is the exception more than the rule. This thesis describes GLR*, a parsing system based on Tomita's Generalized LR parsing algorithm, that was designed to be robust to two particular types of extra-grammaticality: noise...
Robust Stochastic Parsing Using the Inside-Outside Algorithm
, 1992
Cited by 38 (0 self)
this paper, we discuss the application of the Viterbi algorithm and the Baum-Welch algorithm (in wide use for speech recognition) to the parsing problem and describe a recent experiment designed to produce a simple, robust, probabilistic parser which selects an appropriate analysis frequently enough to be useful and deals effectively with the problem of undergeneration. We focus on the application of these stochastic algorithms here because, although other statistically based approaches have been proposed (e.g. Sampson et al., 1989; Garside & Leech, 1985; Magerman & Marcus, 1991a,b), these appear most promising as they are computationally tractable (in principle) and well integrated with formal language / automata theory. The Viterbi algorithm and Baum-Welch algorithm are optimised algorithms (with polynomial computational complexity) which can be used in conjunction with stochastic regular grammars (finite-state automata, i.e. (hidden) Markov models, Baum, 1972) and with probabilistic context-free grammars (Baker, 1982; Fujisaki
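As a rough illustration of the Viterbi algorithm mentioned in the entry above, the following is a minimal sketch for a hidden Markov model. It is not the parser described in the paper: the function name `viterbi`, the two-state tag set, and all probabilities are hypothetical values chosen only to make the example small enough to check by hand.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (most likely state sequence, its probability) for an HMM.

    All arguments are illustrative: start_p[s], trans_p[r][s] and
    emit_p[s][o] are plain dictionaries of probabilities.
    """
    if not obs:
        return [], 1.0

    # best[t][s] = (probability of the best path ending in s at time t,
    #               predecessor state on that path)
    best = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for o in obs[1:]:
        prev = best[-1]
        col = {}
        for s in states:
            # Pick the predecessor r that maximises the path probability.
            p, r = max((prev[r][0] * trans_p[r][s], r) for r in states)
            col[s] = (p * emit_p[s][o], r)
        best.append(col)

    # Backtrace from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for col in reversed(best[1:]):
        path.append(col[path[-1]][1])
    path.reverse()
    return path, prob


# Hypothetical toy model: two tags (N, V) and two words.
states = ["N", "V"]
start_p = {"N": 0.8, "V": 0.2}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.6, "V": 0.4}}
emit_p = {"N": {"dogs": 0.6, "bark": 0.4}, "V": {"dogs": 0.1, "bark": 0.9}}

path, prob = viterbi(["dogs", "bark"], states, start_p, trans_p, emit_p)
print(path, prob)
```

The table `best` grows by one column per observation and each column costs one pass over all state pairs, which is where the polynomial complexity noted in the abstract comes from.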
Extensions to Constraint Dependency Parsing for Spoken Language Processing
- COMPUTER SPEECH AND LANGUAGE
, 1995
Cited by 21 (10 self)
A text-based and spoken language processing framework based on the Constraint Dependency Grammar (CDG) developed by Maruyama [24, 25] is discussed. The scope of CDG is expanded to allow for the analysis of sentences containing lexically ambiguous words, to allow feature analysis in constraints, and to efficiently process multiple sentence candidates that are likely to arise in spoken language processing. The benefits of the CDG parsing approach are summarized. Additionally, the development of CDG grammars using our grammar tools and parser is discussed.
Integrating Language Models with Speech Recognition
- In Proceedings of the AAAI94 Workshop on the Integration of Natural Language and Speech Processing
, 1994
Cited by 11 (5 self)
The question of how to integrate language models with speech recognition systems is becoming more important as speech recognition technology matures. For the purposes of this paper, we have classified the level of integration of current and past approaches into three categories: tightly-coupled, loosely-coupled, or semi-coupled systems. We then argue that loose coupling is more appropriate given the current state of the art and given that it allows one to measure more precisely which components of the language model are most important. We will detail how the speech component in our approach interacts with the language model and discuss why we chose our language model.
1 Introduction
State-of-the-art speech recognition systems achieve high recognition accuracies only on tasks that have low perplexities. The perplexity of a task is, roughly speaking, the average number of choices at any decision point. The perplexity of a task is at a minimum when the true language model is known and co...
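The informal definition of perplexity in the entry above ("the average number of choices at any decision point") can be made concrete with a short sketch. This uses the standard formulation, 2 raised to the negative mean log2 probability of the observed words; the function name `perplexity` and the input probabilities are assumptions made for illustration and do not come from the paper.

```python
import math


def perplexity(word_probs):
    """Perplexity given the probability a model assigned to each observed word.

    Computed as 2 ** (-(1/N) * sum(log2 p_i)). The name and interface are
    hypothetical, chosen for this example only.
    """
    if not word_probs:
        raise ValueError("need at least one probability")
    mean_log2 = sum(math.log2(p) for p in word_probs) / len(word_probs)
    return 2 ** (-mean_log2)


# If the model always faces 8 equally likely choices, perplexity is 8.
uniform = perplexity([1 / 8] * 5)
print(uniform)
```

A model that assigns higher probability to the words that actually occur gets a lower perplexity, which matches the reading of perplexity as an effective branching factor.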
Probabilistic Language Modeling for Generalized LR Parsing
, 1998
Cited by 4 (3 self)
In this thesis, we introduce probabilistic models to rank the likelihood of resultant parses within the GLR parsing framework. Probabilistic models can also bring about the benefit of reduction of search space, if the models allow prefix probabilities for partial parses. In devising the models, we carefully observe the nature of GLR parsing, one of the most efficient parsing algorithms in existence, and formalize two probabilistic models with the appropriate use of the parsing context. The context in GLR parsing is provided by the constraints afforded by context-free grammars in generating an LR table (global context), and the constraints of adjoining pre-terminal symbols (local n-gram context).
Dependency Language Modeling
, 1997
Cited by 4 (3 self)
This report summarizes the work of the Dependency Language Modeling group at the 1996 Summer Speech Workshop at the Center for Language and Speech Processing at Johns Hopkins University (WS96). We motivate and describe a novel statistical language model that models the syntactic dependencies between words. The model is formulated in the maximum entropy framework, which expresses statistical constraints on the frequencies of various types of dependencies, as well as the standard N-gram statistics. We describe how this model was applied to the recognition of spontaneous English speech from the Switchboard corpus. Due to implementation constraints, only a reduced version of our model could be tested so far. The model gave a modest improvement over an N-gram baseline model. A by-product of the project is the Maximum Entropy Modeling Toolkit (MEMT), a freely available software package for domain-independent maximum entropy modeling.
1 Introduction
Current state-of-the-art language models for s...

