Results 1 - 4 of 4
A voice-controlled automatic telephone switchboard and directory information system
- Speech Communication, 1997
Cited by 17 (5 self)
The Philips automatic telephone switchboard and directory information system PADIS provides a natural-language user interface to a telephone directory database. Using speech recognition and language understanding technologies, the system offers phone numbers, fax numbers, email addresses, and room numbers as well as direct call completion to a desired party. In this paper, we present the underlying probabilistic framework, the system architecture, and the individual modules for speech recognition, language understanding, dialogue control, and speech output. In addition, we report results on performance and user behaviour obtained from a field test in our research lab with a 600-entry database. We derive a new maximum-a-posteriori decision rule which incorporates database knowledge and dialogue history as constraints in speech recognition and language understanding. It has improved speech understanding accuracy by 19% (in terms of concept error rate), and reduced attribute substitution errors (e.g. recognition of a wrong name) by 38%. The decision rule is implemented in a multi-stage approach as a combination of state-of-the-art speech recognition, partial parsing with an attributed stochastic context-free grammar, and an N-best algorithm which is also described in this paper. The system conducts a flexible mixed-initiative dialogue rather than using a rigid form-filling scheme, and incorporates database knowledge to optimize the dialogue flow.
Deleted Interpolation And Density Sharing For Continuous Hidden Markov Models
- In Proc. ICASSP, Atlanta, 1996
Cited by 17 (1 self)
As one of the most powerful smoothing techniques, deleted interpolation has been widely used in both discrete and semi-continuous hidden Markov model (HMM) based speech recognition systems. For continuous HMMs, most smoothing techniques are carried out on the parameters themselves such as Gaussian mean or covariance parameters. In this paper, we propose to smooth the probability density values instead of the parameters of continuous HMMs. This allows us to use most of the existing smoothing techniques for both discrete and continuous HMMs. We also point out that our deleted interpolation can be regarded as a parameter sharing technique. We further generalize this sharing to the probability density function (PDF) level, in which each PDF becomes a basic unit and can be freely shared across any Markov state. For a wide range of dictation experiments, deleted interpolation reduced the word error rate by 11% to 23% over other simple parameter smoothing techniques like flooring. Generic PD...
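The idea of smoothing density values rather than parameters can be illustrated with a minimal sketch. It assumes one-dimensional Gaussians, a "detailed" and a "general" model for the same state, and a single held-out set; the function names, the example parameters, and the omission of the rotation over deleted data blocks are simplifications made here and are not taken from the paper.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def estimate_weight(p_detailed, p_general, iterations=20, lam=0.5):
    """EM estimate of the interpolation weight from held-out density values.

    p_detailed[i] and p_general[i] are the densities that the two models
    assign to the i-th held-out observation (hypothetical inputs).
    """
    for _ in range(iterations):
        posteriors = []
        for d, g in zip(p_detailed, p_general):
            total = lam * d + (1.0 - lam) * g
            # Skip observations to which both models assign zero density.
            if total > 0.0:
                posteriors.append(lam * d / total)
        if not posteriors:
            break
        lam = sum(posteriors) / len(posteriors)
    return lam


def smoothed_density(x, lam, detailed, general):
    """Interpolate the density values themselves, not the Gaussian parameters."""
    return lam * gaussian_pdf(x, *detailed) + (1.0 - lam) * gaussian_pdf(x, *general)


# Assumed example models as (mean, variance): a sharp one and a broad one.
detailed = (0.0, 1.0)
general = (0.0, 25.0)
held_out = [0.1, -0.3, 0.5, -0.2, 0.0]

weight = estimate_weight(
    [gaussian_pdf(x, *detailed) for x in held_out],
    [gaussian_pdf(x, *general) for x in held_out],
)
value = smoothed_density(0.1, weight, detailed, general)
```

Because the interpolation acts on density values, the same weight estimation applies regardless of how each component density is parameterized.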
A Bottom-Up Approach For Handling Unseen Triphones In Large Vocabulary Continuous Speech Recognition
- CLEO collaboration), Cornell preprint CLNS 94/1306, CLEO 94/24, 1996
Cited by 1 (0 self)
This paper presents an extension of bottom-up state-tying towards improved handling of unseen triphones. As opposed to the usual backing-off to diphones and monophones, the current method aims at finding a triphone model that has proven to exhibit some similarity with the unseen triphone. It is based on a probabilistic mapping of unseen contexts to clusters of triphone-states observed in the training data.
Substate Tying With Combined Parameter Training and Reduction in Tied-Mixture HMM Design
- Transactions on Speech and Audio Processing, 2002
Cited by 1 (0 self)
Two approaches are proposed for the design of tied-mixture hidden Markov models (TMHMM). One approach improves parameter sharing via partial tying of TMHMM states. To facilitate tying at the substate level, the state emission probabilities are constructed in two stages or, equivalently, are viewed as a "mixture of mixtures of Gaussians." This paradigm allows, and is complemented with, an optimization technique to seek the best complexity-accuracy tradeoff solution, which jointly exploits Gaussian density sharing and substate tying. Another approach to enhance model training is combined training and reduction of model parameters. The procedure starts by training a system with a large universal codebook of Gaussian densities. It then iteratively reduces the size of both the codebook and the mixing coefficient matrix, followed by parameter re-training. The additional cost in design complexity is modest. Experimental results on the ISOLET database and its E-set subset show that substate tying reduces the classification error rate by over 15%, compared to standard Gaussian sharing and whole-state tying. TMHMM design with combined training and reduction of parameters reduces the classification error rate by over 20% compared to conventional TMHMM design. When the two proposed approaches were integrated, 25% error rate reduction over TMHMM with whole-state tying was achieved.
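One way to read the "mixture of mixtures of Gaussians" construction is sketched below. It assumes one-dimensional Gaussians; the codebook values, substate names, and weights are hypothetical and chosen only to show a substate shared between two states. It does not reproduce the paper's training or reduction procedure.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


# Universal codebook of Gaussians shared by all substates, as (mean, variance).
CODEBOOK = [(-2.0, 1.0), (0.0, 1.0), (2.0, 1.0)]

# Stage 1: each substate is a mixture over the shared codebook.
SUBSTATES = {
    "low": [0.7, 0.3, 0.0],
    "mid": [0.1, 0.8, 0.1],
    "high": [0.0, 0.3, 0.7],
}

# Stage 2: each state is a mixture over substates.
# States "s1" and "s2" are only partially tied: they share substate "mid".
STATES = {
    "s1": {"low": 0.6, "mid": 0.4},
    "s2": {"mid": 0.5, "high": 0.5},
}


def substate_density(x, name):
    """Mixture of codebook Gaussians for one substate."""
    return sum(c * gaussian_pdf(x, *g) for c, g in zip(SUBSTATES[name], CODEBOOK))


def emission_density(x, state):
    """State emission density as a mixture of substate mixtures."""
    return sum(w * substate_density(x, k) for k, w in STATES[state].items())
```

In this reading, whole-state tying would force two states to share every substate weight, whereas substate tying lets them share only some components, here "mid".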

