Getting More Mileage from Web Text Sources for Conversational Speech Language Modeling using Class-Dependent Mixtures
Proc. HLT-NAACL 2003, 2003
Cited by 36 (8 self)
Sources of training data suitable for language modeling of conversational speech are limited. In this paper, we show how training data can be supplemented with text from the web filtered to match the style and/or topic of the target recognition task, but also that it is possible to get bigger performance gains from the data by using class-dependent interpolation of N-grams.
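As a rough illustration of what class-dependent interpolation of N-grams can look like, the sketch below mixes two bigram models with weights chosen by the class of the preceding word. This is a minimal sketch under stated assumptions, not the paper's implementation: the two toy models, the word classes, the weights, and the names `interpolate`, `word_class`, and `weights_by_class` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical toy component models: P_i(word | previous word).
# All probabilities below are invented for illustration only.
Bigram = Dict[str, Dict[str, float]]

conversational: Bigram = {
    "i": {"think": 0.6, "mean": 0.4},
    "the": {"model": 0.2, "thing": 0.8},
}
web: Bigram = {
    "i": {"think": 0.3, "mean": 0.7},
    "the": {"model": 0.9, "thing": 0.1},
}

# Assumed class assignment for history words and per-class mixture weights.
# Each weight vector sums to 1 and is ordered like the model list.
word_class: Dict[str, str] = {"i": "pronoun", "the": "determiner"}
weights_by_class: Dict[str, List[float]] = {
    "pronoun": [0.8, 0.2],
    "determiner": [0.4, 0.6],
    "other": [0.5, 0.5],
}


def interpolate(word: str, prev: str, models: List[Bigram]) -> float:
    """Return sum_i lambda_i(class(prev)) * P_i(word | prev)."""
    lambdas = weights_by_class[word_class.get(prev, "other")]
    return sum(
        lam * model.get(prev, {}).get(word, 0.0)
        for lam, model in zip(lambdas, models)
    )


p_think = interpolate("think", "i", [conversational, web])
p_model = interpolate("model", "the", [conversational, web])
```

With a single weight vector shared by all histories this reduces to ordinary linear interpolation; the class-dependent variant simply lets the weights vary with the class of the history.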
A unified context-free grammar and n-gram model for spoken language processing
In International Conference on Acoustics, Speech, and Signal Processing, 2000
Cited by 20 (5 self)
While context-free grammars (CFGs) remain one of the most important formalisms for interpreting natural language, word n-gram models are surprisingly powerful for domain-independent applications. We propose to unify these two formalisms for both speech recognition and spoken language understanding (SLU). With portability as the major problem, we incorporated domain-specific CFGs into a domain-independent n-gram model that can improve generalizability of the CFG and specificity of the n-gram. In our experiments, the unified model can significantly reduce the test set perplexity from 378 to 90 in comparison with a domain-independent word trigram. The unified model converges well when the domain-specific data becomes available. The perplexity can be further reduced from 90 to 65 with a limited amount of domain-specific data. While we have demonstrated excellent portability, the full potential of our approach lies in its unified recognition and understanding that we are investigating.
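For readers who want to put the perplexity figures in context, the following sketch computes perplexity from per-word probabilities using the standard definition (the exponential of the average negative log-probability) and the relative reductions implied by the reported numbers. The function names are hypothetical, and the probability list is a toy input, not data from the paper.

```python
import math
from typing import Sequence


def perplexity(word_probs: Sequence[float]) -> float:
    """Perplexity = exp(-(1/N) * sum(log p_i)) over the N test words."""
    n = len(word_probs)
    return math.exp(-sum(math.log(p) for p in word_probs) / n)


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which a quantity drops when going from before to after."""
    return (before - after) / before


# Toy check: a model that assigns probability 1/4 to every word
# has a perplexity of about 4.
uniform_pp = perplexity([0.25] * 8)

# Relative reductions implied by the perplexities quoted in the abstract.
first_step = relative_reduction(378, 90)   # trigram -> unified model
second_step = relative_reduction(90, 65)   # after adding domain-specific data
```

Under this definition, the drop from 378 to 90 is a relative reduction of roughly 76%, and the further drop from 90 to 65 is roughly 28%.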
Combining Nonlocal, Syntactic And N-Gram Dependencies In Language Modeling
Proceedings of Eurospeech'99, vol, 1999
Cited by 14 (4 self)
A new language model is presented which incorporates local N-gram dependencies with two important sources of long-range dependencies: the syntactic structure and the topic of a sentence. These dependencies or constraints are integrated using the maximum entropy method. Substantial improvements are demonstrated over a trigram model in both perplexity and speech recognition accuracy on the Switchboard task. It is shown that topic dependencies are most useful in predicting words which are semantically related by the subject matter of the conversation. Syntactic dependencies, on the other hand, are found to be most helpful in positions where the best predictors of the following word are not within N-gram range due to an intervening phrase or clause. It is also shown that these two methods individually enhance an N-gram model in complementary ways and the overall improvement from their combination is nearly additive.
Topic modeling in fringe word prediction for AAC
In IUI, 2006
Cited by 11 (6 self)
Word prediction can be used for enhancing the communication ability of persons with speech and language impairments. In this work, we explore two methods of adapting a language model to the topic of conversation, and apply these methods to the prediction of fringe words. Keywords: word prediction, keystroke savings, alternative and augmentative communication (AAC), topic modeling, language modeling.
Experiments on Domain Adaptation for English-Hindi SMT
Cited by 2 (2 self)
Statistical Machine Translation (SMT) systems are usually trained on large amounts of bilingual text and monolingual target language text. If a significant amount of out-of-domain data is added to the training data, the quality of translation can drop. On the other hand, training an SMT system on a small amount of in-domain training material leads to narrow lexical coverage, which again results in low translation quality. In this paper, (i) we explore domain-adaptation techniques to combine large out-of-domain training data with small-scale in-domain training data for English-Hindi statistical machine translation and (ii) we cluster large out-of-domain training data to extract sentences similar to in-domain sentences and apply adaptation techniques to combine clustered sub-corpora with in-domain training data into a unified framework, achieving a 0.44-point absolute improvement in BLEU over the baseline, which corresponds to a 4.03% relative improvement.
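The absolute and relative BLEU figures can be related with simple arithmetic. The sketch below assumes that the relative improvement is the absolute gain divided by the baseline score; the abstract does not state the baseline, so the derived values are only an approximate back-calculation from rounded figures, not reported results.

```python
# Figures quoted in the abstract.
absolute_gain = 0.44      # BLEU points
relative_gain = 0.0403    # 4.03% expressed as a fraction

# Assumption: relative_gain = absolute_gain / baseline_bleu.
implied_baseline = absolute_gain / relative_gain
implied_adapted = implied_baseline + absolute_gain

print(f"implied baseline BLEU: {implied_baseline:.2f}")
print(f"implied adapted BLEU:  {implied_adapted:.2f}")
```

Under that assumption the baseline would sit a little below 11 BLEU points, which helps put the size of the 0.44-point gain in perspective.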
Class-dependent Interpolation for Estimating Language Models from Multiple Text Sources
2003
Cited by 1 (0 self)
Sources of training data suitable for language modeling of conversational speech are limited. In this paper, we show how training data can be supplemented with text from the web filtered to match the style and/or topic of the target recognition task, but also that it is possible to get bigger performance gains from the data by using class-dependent interpolation of N-grams.
Topic Modeling in Word Prediction for AAC
2005
Cited by 1 (1 self)
Word prediction is a method for enhancing the communication ability of persons with speech and language impairments. In this work, we explore one method of adjusting the language model based on the content of a conversation. Keywords: word prediction, keystroke savings, alternative and augmentative communication (AAC), topic modeling, language modeling.
Topic-Based Language Modeling with Dynamic Bayesian Networks
Cited by 1 (0 self)
Although n-gram models are still the de facto standard in language modeling for speech recognition, more sophisticated models achieve better accuracy by taking additional information, such as syntactic rules, semantic relations or domain knowledge, into account. Unfortunately, most of the effort in developing such models goes into the implementation of handcrafted inference routines. A generic mechanism to introduce background knowledge into a language model is lacking. We propose using dynamic Bayesian networks. Dynamic Bayesian networks are a generalization of the n-gram models and HMMs traditionally used in language modeling and speech recognition. Whereas those models use a single random variable to represent state, Bayesian networks can have any number of variables. As such they are particularly well-suited for the construction of models that take additional information into account. This paper discusses language modeling with Bayesian networks. Examples of Bayesian network implementations of well-known language models are given and a novel topic-based language model is presented. Index Terms: language modeling, dynamic Bayesian networks.
Language Modeling for Dialog System
2000
take two forms. Human input can be constrained through a directed dialog, allowing the decoder to use a state-specific language model to improve recognition accuracy. Mixed-initiative systems allow for human input that, while domain-specific, might not be state-specific. Nevertheless, for the most part, human input to a mixed-initiative system is predictable, particularly when given information about the immediately preceding system prompt. The work reported in this paper addresses the problem of balancing state-specific and general language modeling in a mixed-initiative dialog system. By incorporating dialog state adaptation of the language model, we have reduced the recognition error rate by 11.5%.
Adapting Word Prediction to Subject Matter without Topic-labeled Data
Word prediction helps to increase communication rate when using Augmentative and Alternative Communication devices. Basic prediction systems offer predictions that are topically inappropriate for the context, so we adapt the predictions to the topic of discourse. However, previous work has relied on texts that are grouped into topics by humans. In contrast, we avoid this restriction by treating each document as a topic. The results are comparable to those obtained with human-labeled topics, and the method is also applicable to unlabeled text.
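One way to picture "treating each document as a topic" is to weight each training document by its similarity to the conversation so far and mix the per-document word distributions accordingly. The sketch below is only an illustration under stated assumptions: the abstract does not specify the similarity measure or model order, so the cosine similarity over raw word counts, the unigram level, and the names `cosine` and `adapted_unigram` are all hypothetical choices.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def adapted_unigram(
    context_words: Sequence[str], documents: List[Sequence[str]]
) -> Dict[str, float]:
    """Mix per-document unigram models, weighted by similarity to the context."""
    context = Counter(context_words)
    doc_counts = [Counter(doc) for doc in documents]
    sims = [cosine(context, counts) for counts in doc_counts]
    total = sum(sims)
    # Fall back to uniform weights when the context matches no document.
    if total > 0:
        weights = [s / total for s in sims]
    else:
        weights = [1.0 / len(documents)] * len(documents)

    mixed: Dict[str, float] = {}
    for weight, counts in zip(weights, doc_counts):
        n = sum(counts.values())
        for word, count in counts.items():
            mixed[word] = mixed.get(word, 0.0) + weight * count / n
    return mixed


# Toy documents, each acting as its own "topic".
docs = [["pizza", "pasta", "cheese"], ["goal", "match", "team"]]
probs = adapted_unigram(["pizza", "cheese"], docs)
```

In this toy run the food-related document receives all of the weight, so food words are ranked above sports words; no human topic labels are needed because the documents themselves define the topics.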

