Results 1 - 7 of 7
A Maximum Entropy Language Model Integrating N-Grams And Topic Dependencies For Conversational Speech Recognition
- Proceedings of ICASSP'99
, 1999
Cited by 21 (7 self)
A compact language model which incorporates local dependencies in the form of N-grams and long distance dependencies through dynamic topic conditional constraints is presented. These constraints are integrated using the maximum entropy principle. Issues in assigning a topic to a test utterance are investigated. Recognition results on the Switchboard corpus are presented showing that, with a very small increase in the number of model parameters, reductions in word error rate and language model perplexity are achieved over trigram models. Some analysis follows, demonstrating that the gains are even larger on content-bearing words. The results are compared with those obtained by interpolating topic-independent and topic-specific N-gram models. The framework presented here extends easily to incorporate other forms of statistical dependencies such as syntactic word-pair relationships or hierarchical topic constraints.
Combining Nonlocal, Syntactic And N-Gram Dependencies In Language Modeling
- Proceedings of Eurospeech'99, vol
, 1999
Cited by 14 (4 self)
A new language model is presented which incorporates local N-gram dependencies with two important sources of long-range dependencies: the syntactic structure and the topic of a sentence. These dependencies or constraints are integrated using the maximum entropy method. Substantial improvements are demonstrated over a trigram model in both perplexity and speech recognition accuracy on the Switchboard task. It is shown that topic dependencies are most useful in predicting words which are semantically related by the subject matter of the conversation. Syntactic dependencies, on the other hand, are found to be most helpful in positions where the best predictors of the following word are not within N-gram range due to an intervening phrase or clause. It is also shown that these two methods individually enhance an N-gram model in complementary ways and that the overall improvement from their combination is nearly additive.
Efficient Training Methods For Maximum Entropy Language Modeling
- In Proceedings of the 6th International Conference on Spoken Language Technologies (ICSLP-00)
, 2000
Cited by 7 (1 self)
Maximum entropy language modeling techniques combine different sources of statistical dependence, such as syntactic relationships, topic cohesiveness and collocation frequency, in a unified and effective language model. These techniques, however, are also computationally very intensive, particularly during model estimation, compared to the more prevalent alternative of interpolating several simple models, each capturing one type of dependency. In this paper we present ways to significantly reduce this complexity by reorganizing the required computations. We show that in the case of a model with N-gram constraints, each iteration of the parameter estimation algorithm requires the same amount of computation as estimating a comparable back-off N-gram model. In general, the computational cost of each iteration in model estimation is linear in the number of distinct "histories" seen in the training corpus, times a model-class dependent factor. The reorganization focuses mainly on reducing this...
Maximum Entropy Language Modeling with Non-Local Dependencies -- Dissertation Proposal
, 2000
Building A Topic-Dependent Maximum Entropy Model For Very Large Corpora
- In Proceedings of ICASSP2002
, 2002
Cited by 3 (0 self)
Maximum entropy (ME) techniques have been successfully used to combine different sources of linguistically meaningful constraints in language models. However, most of the current ME models can only be used for small corpora, since the computational load in training ME models for large corpora is unbearable. This problem is especially severe when non-local dependencies are considered. In this paper, we show how to train and use topic-dependent ME models efficiently for a very large corpus, Broadcast News (BN). The training time is greatly reduced by hierarchical training and divide-and-conquer approaches. The computation in using the model is also simplified by pre-normalizing the denominators of the ME model. We report new speech recognition results showing improvement with the topic model relative to the standard N-gram model for the Broadcast News task.
Class-dependent Interpolation for Estimating Language Models from Multiple Text Sources
, 2003
Cited by 1 (0 self)
Sources of training data suitable for language modeling of conversational speech are limited. In this paper, we show how training data can be supplemented with text from the web filtered to match the style and/or topic of the target recognition task, and that it is possible to get bigger performance gains from the data by using class-dependent interpolation of N-grams.
Large Vocabulary Statistical Language Modeling for Continuous Speech
- In Proceedings of the 7th European Conference on Speech Communication and Technology
, 2001
Statistical language modeling (SLM) is an essential part of any large-vocabulary continuous speech recognition (LVCSR) system. The development of the standard SLM methods has been strongly affected by the goals of LVCSR in English. The structure of Finnish is substantially different from English, so if the standard SLMs are directly applied, success is by no means guaranteed. In this paper we describe our first attempts at building an LVCSR system for Finnish and the new SLMs that we have tried. One of our objectives has been the indexing and recognition of broadcast news, so the issues of special interest to us are topic detection, word stemming and modeling words that are poorly covered in the training data. Our new methods are based on neural computing using the self-organizing map (SOM), which has recently been shown to successfully extract and approximate latent semantic structures from massive text collections.

