GLR*: A Robust Grammar-Focused Parser for Spontaneously Spoken Language
, 1996
Cited by 40 (9 self)
The analysis of spoken language is widely considered to be a more challenging task than the analysis of written text. All of the difficulties of written language can generally be found in spoken language as well. Parsing spontaneous speech must, however, also deal with problems such as speech disfluencies, the looser notion of grammaticality, and the lack of clearly marked sentence boundaries. The contamination of the input with errors of a speech recognizer can further exacerbate these problems. Most natural language parsing algorithms are designed to analyze "clean" grammatical input. Because they reject any input which is found to be ungrammatical in even the slightest way, such parsers are unsuitable for parsing spontaneous speech, where completely grammatical input is the exception more than the rule. This thesis describes GLR*, a parsing system based on Tomita's Generalized LR parsing algorithm, that was designed to be robust to two particular types of extra-grammaticality: noise...
Putting Language Into Language Modeling
- In Proc. of Eurospeech-99
, 1999
Cited by 13 (0 self)
In this paper we describe the statistical Structured Language Model (SLM) that uses grammatical analysis of the hypothesized sentence segment (prefix) to predict the next word. We first describe the operation of a basic, completely lexicalized SLM that builds up partial parses as it proceeds left to right. We then develop a chart parsing algorithm and with its help a method to compute the prediction probabilities $P(w_{i+1} \mid W_i)$. We suggest useful computational shortcuts followed by a method of training SLM parameters from text data. Finally, we introduce more detailed parametrization that involves non-terminal labeling and considerably improves smoothing of SLM statistical parameters. We conclude by presenting certain recognition and perplexity results achieved on standard corpora.
1. INTRODUCTION In the accepted statistical formulation of the speech recognition problem [1] the recognizer seeks to find the word string $\hat{W} = \arg\max_W P(A \mid W) P(W)$, where A denotes the observab...
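The decision rule quoted in this excerpt (pick the word string W that maximizes P(A|W) P(W)) can be illustrated with a minimal sketch. The candidate strings, their scores, and the function name `decode` are hypothetical and chosen only for illustration; they are not taken from the paper, and a real recognizer searches a far larger hypothesis space.

```python
import math

def decode(candidates, acoustic_logprob, lm_logprob):
    """Return the candidate W that maximizes log P(A|W) + log P(W)."""
    return max(candidates, key=lambda w: acoustic_logprob[w] + lm_logprob[w])

# Hypothetical scores for two candidate transcriptions of one utterance.
acoustic = {
    "recognize speech": math.log(0.30),    # assumed P(A|W)
    "wreck a nice beach": math.log(0.35),  # assumed P(A|W)
}
language_model = {
    "recognize speech": math.log(0.020),   # assumed P(W)
    "wreck a nice beach": math.log(0.001), # assumed P(W)
}

best = decode(list(acoustic), acoustic, language_model)
print(best)
```

With these assumed numbers the language model prior outweighs the slightly better acoustic score of the second candidate, which is the role P(W) plays in the formulation.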
On the Use of Grammar Based Language Models for Statistical Machine Translation
- 6th Int. Workshop on Parsing Technologies
, 1999
Cited by 6 (5 self)
In this paper, we describe some concepts of language models beyond the commonly used standard trigram and prove the need for such language models in statistical machine translation. In statistical machine translation the language model is the system's a-priori knowledge source about the target language. The most important demands on the language model in statistical machine translation are to produce the correct word order, given a certain choice of words, and to score the selection of translations made by the translation model $\Pr(f_1^J \mid e_1^I)$ in view of the syntactic context. Besides investigating standard m-grams with long histories, we examined the use of Part-of-Speech based models as well as linguistically motivated grammars with stochastic parsing as a special type of language model. Translation results are given on the Verbmobil task, where translations are performed from German to English, with vocabulary sizes of 6500 and 4000 words respectively. 1 Introduct...
Stochastic Analysis of Lexical and Semantic Enhanced Structural Language Model
Cited by 1 (1 self)
Abstract. In this paper, we present a directed Markov random field model that integrates trigram models, structural language models (SLM) and probabilistic latent semantic analysis (PLSA) for the purpose of statistical language modeling. The SLM is essentially a generalization of shift-reduce probabilistic push-down automata and is thus more complex and powerful than probabilistic context free grammars (PCFGs). The added context-sensitivity due to trigrams and PLSAs, together with the violation of tree structure in the topology of the underlying random field model, makes the inference and parameter estimation problems plausibly intractable. However, the analysis of the behavior of the lexical and semantic enhanced structural language model leads to a generalized inside-outside algorithm and thus to rigorous exact EM type re-estimation of the composite language model parameters.
AND
This article presents an approach for parsing natural language queries that integrates multiple subparsers and subgrammars, in contrast to the traditional single grammar and parser approach. In using LR(k) parsers for natural language processing, we are faced with the problem of rapid growth in parsing table sizes as the number of grammar rules increases. We propose to partition the grammar into multiple subgrammars, each having its own parsing table and parser. Grammar partitioning helps reduce the overall parsing table size when compared to using a single grammar. We used the GLR parser with an LR(1) parsing table in our framework because GLR parsers can handle ambiguity in natural language. A parser composition technique then combines the parsers' outputs to produce an overall parse that is the same as the output parse of a single parser. Two different strategies were used for parser composition: (i) parser composition by cascading; and (ii) parser composition with predictive pruning. Our experiments were conducted with natural language queries from the ATIS (Air Travel Information Service) domain. We have manually translated the ATIS-3 corpora into Chinese, and consequently we could experiment with grammar partitioning on parallel linguistic corpora. For English, the unpartitioned ATIS grammar has 72,869 states in its parsing table, while the partitioned English grammar has 3,350 states in total. For Chinese, grammar partitioning reduced the overall parsing table size from 29,734 states to 3,894 states. Both results show that grammar partitioning greatly reduces the overall parsing table size. Language understanding performance was also examined. Parser composition imparts a robust parsing capability in our framework, and hence obtains a higher understanding performance when compared to using a single GLR parser.
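As a quick check on the scale of the savings reported in this abstract, the relative reduction in parsing table states can be worked out from the four figures it gives. The helper name `reduction_percent` is hypothetical; only the state counts come from the abstract.

```python
def reduction_percent(before, after):
    """Percentage of parsing table states eliminated by partitioning."""
    return 100.0 * (before - after) / before

# State counts as reported in the abstract.
english = reduction_percent(72869, 3350)
chinese = reduction_percent(29734, 3894)

print(f"English: {english:.1f}% fewer states")
print(f"Chinese: {chinese:.1f}% fewer states")
```

On these figures the partitioned English grammar needs roughly a twentieth of the original states, and the Chinese grammar roughly an eighth.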
Phrase Structure Parsing with Dependency Structure
In this paper we present a novel phrase structure parsing approach with the help of dependency structure. Unlike existing phrase parsers, in our approach the inference procedure is guided by dependency structure, which makes the parsing procedure flexible. The experimental results show our approach is much more accurate. With the help of gold dependency trees, the F1 score of our parser reaches 96.08% on the Penn English Treebank and 90.61% on the Penn Chinese Treebank. With the help of N-best dependency trees generated by modified

