Support vector machines for speech recognition
- Proceedings of the International Conference on Spoken Language Processing
, 1998
Cited by 47 (2 self)
Statistical techniques based on hidden Markov models (HMMs) with Gaussian emission densities have dominated signal processing and pattern recognition literature for the past 20 years. However, HMMs trained using maximum likelihood techniques suffer from an inability to learn discriminative information and are prone to overfitting and over-parameterization. Recent work in machine learning has focused on models, such as the support vector machine (SVM), that automatically control generalization and parameterization as part of the overall optimization process. In this paper, we show that SVMs provide a significant improvement in performance on a static pattern classification task based on the Deterding vowel data. We also describe an application of SVMs to large vocabulary speech recognition, and demonstrate an improvement in error rate on a continuous alphadigit task (OGI Alphadigits) and a large vocabulary conversational speech task (Switchboard). Issues related to the development and optimization of an SVM/HMM hybrid system are discussed.
Modeling Out-Of-Vocabulary Words For Robust Speech Recognition
, 2000
Cited by 43 (5 self)
This thesis concerns the problem of unknown or out-of-vocabulary (OOV) words in continuous speech recognition. Most of today's state-of-the-art speech recognition systems can recognize only words that belong to some predefined finite word vocabulary. When encountering an OOV word, a speech recognizer erroneously substitutes the OOV word with a similar-sounding word from its vocabulary. Furthermore, a recognition error due to an OOV word tends to spread errors into neighboring words, dramatically degrading overall recognition performance.
The Erlangen Spoken Dialogue System EVAR: A State-of-the-Art Information Retrieval System
, 1998
Cited by 14 (8 self)
In this paper, we present an overview of the spoken dialogue system EVAR that was developed at the University of Erlangen. In January 1994, it became accessible over the telephone line and could answer inquiries in the German language about German InterCity train connections. It has since been continuously improved and extended, including some unique features, such as the processing of out-of-vocabulary words and a flexible dialogue strategy that adapts to the quality of the recognition of the user input. In fact, several different versions of the system have emerged, namely a subway information system, train and flight information systems in different languages, and an integrated multilingual and multifunctional system which covers German and three additional languages in parallel. Current research focuses on the introduction of stochastic models into the semantic analysis, on the direct integration of prosodic information into the word recognition process, on the detection of user emotion, and on multilinguality and multifunctionality.
A Multi-Class Approach For Modelling Out-Of-Vocabulary Words
, 2002
Cited by 9 (1 self)
In this paper we present a multi-class extension to our approach for modelling out-of-vocabulary (OOV) words [1]. Instead of augmenting the word search space with a single OOV model, we add several OOV models, one for each class of words. We present two approaches for designing the OOV word classes. The first approach relies on using common part-of-speech tags. The second approach is a data-driven two-step clustering procedure, where the first step uses agglomerative clustering to derive an initial class assignment, while the second step uses iterative clustering to move words from one class to another in order to reduce the model perplexity. We present experiments within the JUPITER weather information domain. Results show that the multi-class model significantly improves performance over using a single OOV class. For an OOV detection rate of 70%, the false alarm rate is reduced from 5.3% for a single class to 2.9% for an eight-class model.
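The two false alarm rates quoted above can also be read as a relative reduction. The following is a minimal sketch of that arithmetic only, using the two figures from the abstract; the function and variable names are illustrative and do not come from the paper.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the relative reduction from baseline to improved, as a fraction."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline


# False alarm rates at a 70% OOV detection rate, as quoted in the abstract.
single_class_fa = 5.3  # percent, single OOV class
eight_class_fa = 2.9   # percent, eight-class model

reduction = relative_reduction(single_class_fa, eight_class_fa)
print(f"Relative reduction in false alarms: {reduction:.1%}")
```

Under these figures the eight-class model removes a little under half of the false alarms produced by the single-class model at the same detection rate.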
Semantic Processing Of Out-Of-Vocabulary Words In A Spoken Dialogue System
, 1997
Cited by 9 (4 self)
One of the most important causes of failure in spoken dialogue systems is usually neglected: the problem of words that are not covered by the system's vocabulary (out-of-vocabulary or OOV words). In this paper a methodology is described for the detection, classification and processing of OOV words in an automatic train timetable information system [2]. The various extensions that had to be effected on the different modules of the system are reported, resulting in the design of appropriate dialogue strategies, as are encouraging evaluation results on the new versions of the word recogniser and the linguistic processor.
1. INTRODUCTION
The majority of speech understanding systems have to face the problem of words that are not covered by their current lexicon, i.e. OOV words. In such a case the word recogniser usually recognises one or more different words with a similar acoustic profile to the unknown. These misrecognitions often result in possibly irreparable misunderstandings between ...
Morph-Based Speech Recognition and Modeling of Out-of-Vocabulary Words Across Languages
Cited by 5 (1 self)
We explore the use of morph-based language models in large-vocabulary continuous speech recognition systems across four so-called "morphologically rich" languages: Finnish, Estonian, Turkish, and Egyptian Colloquial Arabic. The morphs are subword units discovered in an unsupervised, data-driven way using the Morfessor algorithm. By estimating n-gram language models over sequences of morphs instead of words, the quality of the language model is improved through better vocabulary coverage and reduced data sparsity. Standard word models suffer from high out-of-vocabulary (OOV) rates, whereas the morph models can recognize previously unseen word forms by concatenating morphs. It is shown that the morph models do perform fairly well on OOVs without compromising the recognition accuracy on in-vocabulary words. The Arabic experiment constitutes the only exception, since here the standard word model outperforms the morph model. Differences in the data sets and the amount of data are discussed as a plausible explanation.
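The idea that unseen word forms can be covered by concatenating morphs can be illustrated with a small sketch. It assumes a hand-written toy morph inventory rather than actual Morfessor output, and it only checks whether a word can be split into known morphs; it does not model the n-gram language model or the recognizer. All names here are hypothetical.

```python
from functools import lru_cache
from typing import FrozenSet, List, Optional


def segment(word: str, morphs: FrozenSet[str]) -> Optional[List[str]]:
    """Return one segmentation of word into known morphs, or None if none exists."""

    @lru_cache(maxsize=None)
    def solve(start: int):
        # An empty remainder is trivially segmented.
        if start == len(word):
            return ()
        # Try longer candidate morphs first, then fall back to shorter ones.
        for end in range(len(word), start, -1):
            piece = word[start:end]
            if piece in morphs:
                rest = solve(end)
                if rest is not None:
                    return (piece,) + rest
        return None

    result = solve(0)
    return list(result) if result is not None else None


# Toy inventory (an assumption for illustration, not Morfessor output).
toy_morphs = frozenset({"talo", "auto", "ssa", "i", "lla"})
word_vocabulary = {"talossa", "autoilla"}  # full word forms seen in training

unseen = "autossa"
print(unseen in word_vocabulary)      # False: OOV for a word-based model
print(segment(unseen, toy_morphs))    # ['auto', 'ssa']
print(segment("kissa", toy_morphs))   # None: no covering segmentation
```

A word-based model treats "autossa" as OOV, while the morph inventory covers it through two units that were each observed in other words.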
Detection and Transcription of OOV Words
, 1998
Cited by 3 (0 self)
This thesis deals with the problem of Out-Of-Vocabulary words in speech recognition. The standard response of speech recognition systems whenever they encounter such OOV words is to (silently) misrecognize them without issuing any warning to the user. In order to avoid this undesired behaviour, two different strategies are proposed. The first strategy consists of preventing the problem, i.e. the occurrence of OOV words, and this thesis presents two ways of doing that. First, the system vocabulary is optimized using information extracted from other corpora and application domains, such that the number of expected OOV words is minimized. Using this method, the vocabulary coverage was significantly improved, especially for small vocabularies. The second method of reducing the number of OOV words consists of redefining the concept of "word" based on morphological considerations. In particular, compound words are decomposed into their constituent parts, which are used as the lexical recogni...
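The relationship between vocabulary choice and the number of OOV words can be sketched in a few lines. This is a simplification under stated assumptions: the vocabulary is chosen purely by word frequency in a single toy corpus, whereas the thesis draws on information from other corpora and application domains in ways the abstract does not detail. The function names and sample sentences are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set


def select_vocabulary(corpus_tokens: Iterable[str], size: int) -> Set[str]:
    """Pick the `size` most frequent tokens as the recognizer vocabulary."""
    counts = Counter(corpus_tokens)
    return {word for word, _ in counts.most_common(size)}


def oov_rate(test_tokens: List[str], vocabulary: Set[str]) -> float:
    """Fraction of running test tokens that are not in the vocabulary."""
    if not test_tokens:
        return 0.0
    missing = sum(1 for token in test_tokens if token not in vocabulary)
    return missing / len(test_tokens)


# Toy data, made up for illustration only.
training = "the train to berlin leaves the station the train is late".split()
test = "the train to munich".split()

small_vocab = select_vocabulary(training, size=2)   # {'the', 'train'}
full_vocab = select_vocabulary(training, size=100)  # every training word

print(oov_rate(test, small_vocab))  # 0.5  ('to' and 'munich' are missing)
print(oov_rate(test, full_vocab))   # 0.25 (only 'munich' is missing)
```

Even the full training vocabulary leaves "munich" uncovered, which is the kind of gap that motivates drawing vocabulary candidates from additional sources.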
Research Issues for the Next Generation Spoken Dialogue Systems Revisited
, 2001
In this paper we take a second look at current research issues for conversational dialogue systems addressed in [17]. We look at two systems, a movie information system and a stock information system, which were built based on the experiences with the train information system Evar, described in [17].
Towards a Robust Grammar-based Dialogue System
, 2005
Grammar-based dialogue systems provide many advantages over keyword-based approaches, not least a degree of domain independence. Systems that use grammars to build semantic-pragmatic interpretations of utterances and use these to maintain and update a dialogue information state are becoming widespread, at least in the research domain. A disadvantage of such systems is their lack of robustness in the face of imperfect communication between user and system. In this
A Study of the Use and Evaluation of Confidence . . .
, 1998
Confidence measures have been found to be useful for a number of tasks within the field of Automatic Speech Recognition (ASR). For example, the use of confidence measures has been reported in the utterance verification, keyword spotting and Out-of-Vocabulary (OOV) word spotting literature. In this report, it is shown that so-called 'hybrid Artificial Neural Network/Hidden Markov Model' (HMM/ANN) systems are well suited to the task of generating confidence measures, due to their ability to provide local phone class posterior probability estimates which may be used to generate confidence measures in a computationally efficient manner. A number of evaluation metrics are also described and the performance of five confidence measures derived from the ABBOT hybrid HMM/ANN system for the tasks of utterance verification and OOV word spotting are evaluated using these metrics. Besides the tasks described above, confidence measures may also be used for tasks such as filtering the acoustics for a nu...
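One way local posterior estimates can be turned into a confidence score is to average the log posterior of the hypothesised phone class over the frames of a segment. The sketch below assumes that form; the abstract does not say whether it matches any of the five measures evaluated in the report, and the threshold, function names, and posterior values are made up for illustration.

```python
import math
from typing import Sequence


def mean_log_posterior(frame_posteriors: Sequence[float]) -> float:
    """Duration-normalised log posterior of the hypothesised class over a segment."""
    if not frame_posteriors:
        raise ValueError("segment must contain at least one frame")
    floor = 1e-12  # avoid log(0) for frames with a zero posterior estimate
    total = sum(math.log(max(p, floor)) for p in frame_posteriors)
    return total / len(frame_posteriors)


def accept(frame_posteriors: Sequence[float], threshold: float) -> bool:
    """Accept the hypothesis when its confidence reaches the threshold."""
    return mean_log_posterior(frame_posteriors) >= threshold


# Hypothetical per-frame posteriors for the hypothesised phone class.
confident_segment = [0.9, 0.8, 0.95]
doubtful_segment = [0.3, 0.2, 0.4]
threshold = math.log(0.5)  # illustrative operating point, not from the report

print(accept(confident_segment, threshold))  # True
print(accept(doubtful_segment, threshold))   # False
```

Normalising by the number of frames keeps scores comparable across segments of different durations, and the cost is a single pass over values the network has already produced.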

