Two decades of statistical language modeling: Where do we go from here
- Proceedings of the IEEE
, 2000
Cited by 119 (1 self)
Statistical Language Models estimate the distribution of various natural language phenomena for the purpose of speech recognition and other language technologies. Since the first significant model was proposed in 1980, many attempts have been made to improve the state of the art. We review them here, point to a few promising directions, and argue for a Bayesian approach to integration of linguistic theories with data.
1. OUTLINE
Statistical language modeling (SLM) is the attempt to capture regularities of natural language for the purpose of improving the performance of various natural language applications. By and large, statistical language modeling amounts to estimating the probability distribution of various linguistic units, such as words, sentences, and whole documents.
Statistical language modeling is crucial for a large variety of language technology applications. These include speech recognition (where SLM got its start), machine translation, document classification and routing, optical character recognition, information retrieval, handwriting recognition, spelling correction, and many more. In machine translation, for example, purely statistical approaches have been introduced in [1]. But even researchers using rule-based approaches have found it beneficial to introduce some elements of SLM and statistical estimation [2]. In information retrieval, a language modeling approach was recently proposed by [3], and a statistical/information theoretical approach was developed by [4].
SLM employs statistical estimation techniques using language training data, that is, text. Because of the categorical nature of language, and the large vocabularies people naturally use, statistical techniques must estimate a large number of parameters, and consequently depend critically on the availability of large amounts of training data.
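To make the dependence on training data concrete, here is a minimal sketch of a maximum-likelihood bigram estimator. It is not taken from the paper: the function name `train_bigram_mle`, the sentence-boundary tokens, and the three-sentence toy corpus are assumptions chosen only for illustration.

```python
from collections import Counter

def train_bigram_mle(sentences):
    """Return prob(word, prev): the maximum-likelihood bigram estimate
    count(prev, word) / count(prev), with no smoothing."""
    history_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        # <s> and </s> are assumed sentence-boundary markers.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, word)] += 1

    def prob(word, prev):
        if history_counts[prev] == 0:
            return 0.0  # history never observed in training
        return bigram_counts[(prev, word)] / history_counts[prev]

    return prob

# Toy training text (hypothetical).
toy_corpus = ["the cat sat", "the dog sat", "the cat ran"]
p = train_bigram_mle(toy_corpus)
```

With this toy corpus, a perfectly plausible pair such as "dog ran" receives probability zero because it never occurs in the training text. This is the sparsity problem that grows with vocabulary size and model order, and it is why unsmoothed estimates are rarely used in practice.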
Probabilistic Top-Down Parsing and Language Modeling
- Computational Linguistics
, 2004
Cited by 54 (1 self)
This paper describes the functioning of a broad-coverage probabilistic top-down parser, and its application to the problem of language modeling for speech recognition. The paper first introduces key notions in language modeling and probabilistic parsing, and briefly reviews some previous approaches to using syntactic structure for language modeling. A lexicalized probabilistic top-down parser is then presented, which performs very well, in terms of both the accuracy of returned parses and the efficiency with which they are found, relative to the best broad-coverage statistical parsers. A new language model that utilizes probabilistic top-down parsing is then outlined, and empirical results show that it improves upon previous work in test corpus perplexity. Interpolation with a trigram model yields an exceptional improvement relative to the improvement observed by other models, demonstrating the degree to which the information captured by our parsing model is orthogonal to that captured by a trigram model. A small recognition experiment also demonstrates the utility of the model.
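The two quantities this abstract relies on, test-corpus perplexity and interpolation with a trigram model, can be sketched as follows. This is a generic illustration under stated assumptions, not the paper's implementation: the per-word probabilities are made up, the mixing weight of 0.5 is arbitrary, and the names `interpolate` and `perplexity` are hypothetical.

```python
import math

def interpolate(p_a, p_b, weight_a):
    """Linear interpolation of two models' probabilities for the same word."""
    return weight_a * p_a + (1.0 - weight_a) * p_b

def perplexity(word_probs):
    """Perplexity = 2 ** (average negative log2 probability per word)."""
    log_sum = sum(math.log2(p) for p in word_probs)
    return 2.0 ** (-log_sum / len(word_probs))

# Made-up per-word probabilities assigned by two models to a four-word test text.
trigram_probs = [0.20, 0.01, 0.10, 0.05]
parser_probs = [0.05, 0.10, 0.02, 0.20]

# Equal-weight mixture; in practice the weight would be tuned on held-out data.
mixed_probs = [interpolate(t, s, 0.5) for t, s in zip(trigram_probs, parser_probs)]
```

In this toy setup the two models are confident on different words, so the mixture ends up with lower perplexity than either model alone. That is the sense in which interpolation gains point to complementary information.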
Maximum Entropy Techniques for Exploiting Syntactic, Semantic and Collocational Dependencies in Language Modeling
Cited by 33 (7 self)
A new statistical language model is presented which combines collocational dependencies with two important sources of long-range statistical dependence: the syntactic structure and the topic of a sentence. These dependencies or constraints are integrated using the maximum entropy technique. Substantial improvements are demonstrated over a trigram model in both perplexity and speech recognition accuracy on the Switchboard task. A detailed analysis of the performance of this language model is provided in order to characterize the manner in which it performs better than a standard N-gram model. It is shown that topic dependencies are most useful in predicting words which are semantically related by the subject matter of the conversation. Syntactic dependencies on the other hand are found to be most helpful in positions where the best predictors of the following word are not within N-gram range due to an intervening phrase or clause. It is also shown that these two methods ind...
Probabilistic Models of Verb-Argument Structure
, 2002
Cited by 15 (0 self)
We evaluate probabilistic models of verb argument structure trained on a corpus of verbs and their syntactic arguments. Models designed to represent patterns of verb alternation behavior are compared with generic clustering models in terms of the perplexity assigned to held-out test data. While the specialized models of alternation do not perform as well, closer examination reveals alternation behavior represented implicitly in the generic models.
Putting Language Into Language Modeling
- In Proc. of Eurospeech-99
, 1999
Cited by 13 (0 self)
In this paper we describe the statistical Structured Language Model (SLM) that uses grammatical analysis of the hypothesized sentence segment (prefix) to predict the next word. We first describe the operation of a basic, completely lexicalized SLM that builds up partial parses as it proceeds left to right. We then develop a chart parsing algorithm and with its help a method to compute the prediction probabilities $P(w_{i+1} \mid W_i)$. We suggest useful computational shortcuts followed by a method of training SLM parameters from text data. Finally, we introduce more detailed parametrization that involves non-terminal labeling and considerably improves smoothing of SLM statistical parameters. We conclude by presenting certain recognition and perplexity results achieved on standard corpora.
1. INTRODUCTION
In the accepted statistical formulation of the speech recognition problem [1] the recognizer seeks to find the word string $\hat{W} := \arg\max_W P(A \mid W)\, P(W)$ where A denotes the observab...
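The decision rule quoted at the end of this excerpt, choosing the word string W that maximizes P(A|W) P(W), can be sketched as a selection over a list of candidate transcriptions. Everything below is illustrative: the hypothesis list, its log-probability scores, and the function name `best_hypothesis` are assumptions, and the scores are handled in the log domain so that the product becomes a sum.

```python
def best_hypothesis(hypotheses):
    """Return the hypothesis maximizing log P(A|W) + log P(W),
    which is equivalent to maximizing P(A|W) * P(W)."""
    return max(
        hypotheses,
        key=lambda h: h["acoustic_logprob"] + h["lm_logprob"],
    )

# Hypothetical candidates with made-up scores: the second fits the audio
# slightly better, but the language model finds it far less likely.
candidates = [
    {"words": "recognize speech", "acoustic_logprob": -12.0, "lm_logprob": -4.0},
    {"words": "wreck a nice beach", "acoustic_logprob": -11.5, "lm_logprob": -9.0},
]

winner = best_hypothesis(candidates)
```

Here the language model term P(W) overrides a small acoustic advantage, which is the role any language model, structured or n-gram, plays in this formulation.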
On the Use of Grammar Based Language Models for Statistical Machine Translation
- 6th Int. Workshop on Parsing Technologies
, 1999
Cited by 6 (5 self)
In this paper, we describe some concepts of language models beyond the commonly used standard trigram and prove the need for such language models in statistical machine translation. In statistical machine translation the language model is the a-priori knowledge source of the system about the target language. The most important demands on the language model in statistical machine translation are to ensure the correct word order, given a certain choice of words, and to score the selection of translations made by the translation model $\Pr(f_1^J \mid e_1^I)$ in view of the syntactic context. Besides the investigation of standard m-grams with long histories, we examined the use of Part-of-Speech based models as well as linguistically motivated grammars with stochastic parsing as a special type of language model. Translation results are given on the Verbmobil task, where translations are performed from German to English, with vocabulary sizes of 6500 and 4000 words respectively. 1 Introduct...
Semantic structured language models
- In: ICSLP
, 2002
Cited by 3 (1 self)
In this study, we propose two novel semantic language modeling techniques for spoken dialog systems. These methods are called semantic concept based language modeling and semantic structured language modeling. In the concept based language modeling, we propose to use long span semantic units to model meaning sequences in spoken utterances. In the latter technique, we use statistical semantic parsers to extract information from a sentence. This information is then utilized in a maximum entropy based language model. The language models are trained and evaluated in the air travel reservation domain. We obtain improvement over a sophisticated class based N-gram language model both in terms of recognition accuracy and perplexity. Interpolation of the proposed techniques with the class-based N-gram LM provides additional improvement.
Incorporating Linguistic Structure into Statistical Language Models
- In Philosophical Transactions of the Royal Society of London A
, 2000
Robust, Finite-State Parsing for Spoken Language Understanding
Cited by 1 (1 self)
Human understanding of spoken language appears to integrate the use of contextual expectations with acoustic level perception in a tightly-coupled, sequential fashion. Yet computer speech understanding systems typically pass the transcript produced by a speech recognizer into a natural language parser with no integration of acoustic and grammatical constraints. One reason for this is the complexity of implementing that integration. To address this issue we have created a robust, semantic parser as a single finite-state machine (FSM). As such, its run-time action is less complex than other robust parsers that are based on either chart or generalized left-right (GLR) architectures. Therefore, we believe it is ultimately more amenable to direct integration with a speech decoder.
Speech Enhanced Multi-span Language Model
To capture local and global constraints in a language, statistical n-grams are used in combination with multi-span language models for improved language modelling. Using latent semantic analysis (LSA) to capture the global semantic constraints and bigram models to capture local constraints is shown to reduce the perplexity of the model. In this paper we propose a method in which the multi-span LSA language model can be developed based on the speech signal. Reference pattern vectors are derived from the speech signal for each word in the vocabulary. The LSA model is developed based on the normalised distance between the reference word pattern vector and the pattern vector of a word in the training data. We show that this model in combination with a standard bigram model performs better than the conventional bigram + LSA model. The results are demonstrated for a limited vocabulary on a database for the Indian language Tamil.

