Using Perfect Sampling in Parameter Estimation of a Whole Sentence Maximum Entropy Language Model
, 2000
Cited by 1 (0 self)
The Maximum Entropy principle (ME) is an appropriate framework for combining information of a diverse nature from several sources into the same language model. In order to incorporate long-distance information into the ME framework in a language model, a Whole Sentence Maximum Entropy Language Model (WSME) could be used. Until now, Monte Carlo Markov Chain (MCMC) sampling techniques have been used to estimate the parameters of the WSME model. In this paper, we propose the application of another sampling technique: Perfect Sampling (PS). The experiment showed a reduction of 30% in the perplexity of the WSME model over the trigram model and a reduction of 2% over the WSME model trained with MCMC.
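The perplexity figures quoted in this abstract can be read with the help of a minimal sketch, which assumes the usual definition of perplexity as the exponential of the negative average log-probability per word. The function names and the toy numbers are hypothetical and are not taken from the paper. Because a whole-sentence model scores complete sentences, the sketch takes one log-probability per sentence plus the total word count.

```python
import math

def perplexity(sentence_logprobs, num_words):
    """Perplexity from natural-log sentence probabilities and the total word count."""
    if num_words <= 0:
        raise ValueError("num_words must be positive")
    return math.exp(-sum(sentence_logprobs) / num_words)

def relative_reduction(baseline_pp, new_pp):
    """Relative perplexity reduction of new_pp over baseline_pp, as a fraction."""
    return (baseline_pp - new_pp) / baseline_pp

# Hypothetical log-probabilities for two test sentences (5 and 3 words),
# invented for illustration only.
baseline_logprobs = [-25.0, -14.0]
improved_logprobs = [-23.0, -13.0]
total_words = 8

pp_baseline = perplexity(baseline_logprobs, total_words)
pp_improved = perplexity(improved_logprobs, total_words)
print(f"relative reduction: {relative_reduction(pp_baseline, pp_improved):.1%}")
```

Under this reading, a "reduction of 30%" means the new model's perplexity is 0.7 times that of the baseline.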
Topic-Based Language Modeling with Dynamic Bayesian Networks
Cited by 1 (0 self)
Although n-gram models are still the de facto standard in language modeling for speech recognition, more sophisticated models achieve better accuracy by taking additional information, such as syntactic rules, semantic relations or domain knowledge, into account. Unfortunately, most of the effort in developing such models goes into the implementation of handcrafted inference routines. A generic mechanism to introduce background knowledge into a language model is lacking. We propose using dynamic Bayesian networks. Dynamic Bayesian networks are a generalization of the n-gram models and HMMs traditionally used in language modeling and speech recognition. Whereas those models use a single random variable to represent state, Bayesian networks can have any number of variables. As such they are particularly well-suited for the construction of models that take additional information into account. This paper discusses language modeling with Bayesian networks. Examples of Bayesian network implementations of well-known language models are given and a novel topic-based language model is presented. Index Terms: language modeling, dynamic Bayesian networks.
Email classification for automated service handling
Cited by 1 (0 self)
We describe the experience and lessons learned from developing a range of electronic services for a specialist engineering company. We are using a custom workflow management system as the base for a range of services which are offered via a multimodal portal, using a language-based approach to extracting information from HTML forms, email, and SMS. We describe the email classification experiments we have carried out and discuss the development of customer services based on automatic email classification.
Improvement of a Whole Sentence Maximum Entropy Language Model Using Grammatical Features
, 2001
Cited by 1 (0 self)
In this paper, we propose adding long-term grammatical information to a Whole Sentence Maximum Entropy Language Model (WSME) in order to improve the performance of the model. The grammatical information was added to the WSME model as features and was obtained from a Stochastic Context-Free Grammar. Finally, experiments using a part of the Penn Treebank corpus were carried out and significant improvements were achieved.
LANGUAGE MODEL ADAPTATION FOR AUTOMATIC SPEECH RECOGNITION AND STATISTICAL MACHINE TRANSLATION
, 2004
Cited by 1 (0 self)
Language modeling is critical and indispensable for many natural language applications such as automatic speech recognition and machine translation. Due to the complexity of natural language grammars, it is almost impossible to construct language models by a set of linguistic rules; therefore statistical techniques have been dominant for language modeling over the last few decades. All statistical modeling techniques, in principle, work under some conditions: 1) a reasonable amount of training data is available and 2) the training data comes from the same population as the test data to which we want to apply our model. Based on observations from the training data, we build statistical models; therefore, the success of a statistical model is crucially dependent on the training data. In other words, if we don’t have enough data for training, or the training data is not matched with the test data, we are not able to build accurate statistical models. This thesis presents novel methods to cope with those problems in language modeling: language model adaptation.
Language Model Adaptation
, 2000
attempt to exploit longer distance dependencies.
-- infer some notion of `topic' from text.
-- compute topic dependent probability.
8th ELSNET summer school, Language Model Adaptation, 26 July 2000
Adaptive Language Modelling
Stage 1: automatic derivation of topic information from text.
-- loose definition of document: a unit of spoken (or written) data of a certain length that contains some topic(s), or content(s).
-- topic of a document ⇐ long distance or document-wide statistics.
-- information retrieval (IR): `bag-of-words' model based on a histogram of weighted unigram frequencies.
Stage 2: combination of global and topic-dependent text statistics.
-- mixture.
-- maximum entropy modelling. (ref) Jelinek (1997).
Mixtur
Use Of Non-Negative Matrix Factorization For Language Model
- In Proc. of ICASSP
, 2001
This paper introduces Non-negative Matrix Factorization for Language Model adaptation. This approach is an alternative to Latent Semantic Analysis based Language Modeling using Singular Value Decomposition (SVD), with several benefits. A new method, which does not require an explicit document segmentation of the training corpus, is presented as well. This method resulted in a perplexity reduction of 16% on a database of biology lecture transcriptions.
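As background for this abstract, the following is a minimal sketch of non-negative matrix factorization on a toy word-by-document count matrix. It uses the standard multiplicative updates for the squared-error objective, which is an assumption made here for brevity; the paper may use a different objective or update rule. The matrix values and all function names are hypothetical.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def frobenius_error(v, w, h):
    """Squared reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor non-negative V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(n)]
    return w, h

# Hypothetical counts: rows are words, columns are documents, two rough topics.
counts = [[3, 2, 0, 0],
          [2, 3, 0, 0],
          [0, 0, 4, 1],
          [0, 0, 1, 4]]
w, h = nmf(counts, rank=2)
print("squared reconstruction error:", frobenius_error(counts, w, h))
```

Unlike SVD factors, both W and H stay non-negative throughout, so their columns and rows can be read as additive word and document weights.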
The Use Of Confidence Measures In Vector Based Call-Routing
- In Proc. 8th European Conf. on Speech Communication and Technology
, 2003
In previous work, we experimented with different techniques of vector-based call routing, using the transcriptions of the queries to compare algorithms. In this paper, we base the routing decisions on the recogniser output rather than transcriptions and examine the use of confidence measures (CMs) to combat the problems caused by the "noise" in the recogniser output. CMs are derived both for the words output from the recogniser and for the routings themselves and are used to investigate improving both routing accuracy and routing confidence. Results are given for a 35-route retail store enquiry-point task. They suggest that although routing error is controlled by the recogniser error rate, confidence in routing decisions can be improved using these techniques.
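The general idea of vector-based routing with confidence measures can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the cosine-similarity routing, the weighting of each recognised word by its confidence, the score margin used as a routing confidence, and the two toy routes. The abstract does not specify how the paper's CMs are computed.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def route(recognised, route_vectors):
    """Pick a route for a list of (word, word_confidence) pairs.

    Returns the best route and a hypothetical routing confidence:
    the margin between the best and second-best cosine scores.
    """
    query = Counter()
    for word, confidence in recognised:
        query[word] += confidence  # low-confidence words contribute less
    scores = sorted(((cosine(query, vec), name)
                     for name, vec in route_vectors.items()), reverse=True)
    best_score, best_route = scores[0]
    runner_up = scores[1][0] if len(scores) > 1 else 0.0
    return best_route, best_score - runner_up

# Hypothetical routes and recogniser output, invented for illustration.
routes = {
    "gardening": {"garden": 1.0, "plant": 1.0, "soil": 1.0},
    "electrical": {"cable": 1.0, "plug": 1.0, "light": 1.0},
}
hypothesis = [("garden", 0.9), ("light", 0.2), ("soil", 0.8)]
chosen, margin = route(hypothesis, routes)
print(chosen, round(margin, 3))
```

A small margin would flag a routing decision that the system might confirm with the caller rather than act on directly.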
LSA-based Language Model Adaptation for Highly Inflected Languages
This paper presents a language model topic adaptation framework for highly inflected languages. In such languages, subword units are used as basic units for language modeling. Since such units carry little semantic information, they are not very suitable for topic adaptation. We propose to lemmatize the corpus of training documents before constructing a latent topic model. To adapt the language model, we use a few lemmatized training sentences to find a set of documents that are semantically close to the current document. Fast marginal adaptation of a subword trigram language model is used for adapting the background model. Experiments on a set of Estonian test texts show that the proposed approach gives a 19% decrease in language model perplexity. A statistically significant decrease in perplexity is already observed when using just two sentences for adaptation. We also show that the model employing lemmatization gives consistently better results than the unlemmatized model. Index Terms: speech recognition, language model adaptation, LSA, inflected languages
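Fast marginal adaptation, as mentioned in this abstract, is commonly formulated as unigram rescaling. Each background conditional probability is multiplied by the ratio of the topic unigram probability to the background unigram probability, raised to a power beta, and the result is renormalized for each history. The sketch below assumes that formulation. The value of beta, the unit names and all probabilities are hypothetical, and the paper's exact setup may differ.

```python
def fast_marginal_adaptation(bg_conditional, bg_unigram, topic_unigram, beta=0.5):
    """Adapt P_bg(w | h) for one fixed history h by unigram rescaling.

    bg_conditional: background P(w | h) over the vocabulary
    bg_unigram:     background unigram P(w)
    topic_unigram:  unigram P(w) estimated from topic-related documents
    """
    scaled = {
        w: p * (topic_unigram[w] / bg_unigram[w]) ** beta
        for w, p in bg_conditional.items()
    }
    z = sum(scaled.values())  # history-dependent normalizer Z(h)
    return {w: s / z for w, s in scaled.items()}

# Hypothetical three-unit vocabulary, invented for illustration.
bg_conditional = {"unit_a": 0.2, "unit_b": 0.5, "unit_c": 0.3}
bg_unigram = {"unit_a": 0.3, "unit_b": 0.4, "unit_c": 0.3}
topic_unigram = {"unit_a": 0.6, "unit_b": 0.2, "unit_c": 0.2}

adapted = fast_marginal_adaptation(bg_conditional, bg_unigram, topic_unigram)
print(adapted)
```

Units that are more frequent in the topic documents than in the background corpus gain probability mass, while the ranking imposed by the trigram context is otherwise preserved.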

