Results 1 -
4 of
4
Full Expansion Of Context-Dependent Networks In Large Vocabulary Speech Recognition
- Proceedings of ICASSP 98
, 1998
"... We combine our earlier approach to context-dependent network representation with our algorithm for determinizing weighted networks to build optimized networks for large-vocabulary speech recognition combining an n-gram language model, a pronunciation dictionary and context-dependency modeling. While ..."
Abstract
-
Cited by 32 (12 self)
- Add to MetaCart
We combine our earlier approach to context-dependent network representation with our algorithm for determinizing weighted networks to build optimized networks for large-vocabulary speech recognition combining an n-gram language model, a pronunciation dictionary and context-dependency modeling. While fullyexpanded networks have been used before in restrictive settings (medium vocabulary or no cross-word contexts), we demonstrate that our network determinization method makes it practical to use fully-expanded networks also in large-vocabulary recognition with full cross-word context modeling. For the DARPA North American Business News task (NAB), we give network sizes and recognition speeds and accuracies using bigram and trigram grammars with vocabulary sizes ranging from 10,000 to 160,000 words. With our construction, the fully-expanded NAB context-dependent networks contain only about twice as many arcs as the corresponding language models. Interestingly, we also find that, with these...
Network Optimizations for Large Vocabulary Speech Recognition
- Speech Communication
, 1998
"... The redundancy and the size of networks in large-vocabulary speech recognition systems can have a critical effect on their overall performance. We describe the use of two new algorithms: weighted determinization and minimization [12]. These algorithms transform recognition labeled networks into equi ..."
Abstract
-
Cited by 16 (7 self)
- Add to MetaCart
The redundancy and the size of networks in large-vocabulary speech recognition systems can have a critical effect on their overall performance. We describe the use of two new algorithms: weighted determinization and minimization [12]. These algorithms transform recognition labeled networks into equivalent ones that require much less time and space in large-vocabulary speech recognition. They are both optimal: weighted determinization eliminates the number of alternatives at each state to the minimum, and weighted minimization reduces the size of deterministic networks to the smallest possible number of states and transitions. These algorithms generalize classical automata determinization and minimization to deal properly with the probabilities of alternative hypotheses and with the relationships between units (distributions, phones, words) at different levels in the recognition system. We illustrate their use in several applications, and report the results of our experiments. Key words...
Transducer Composition for Context-Dependent Network Expansion
- In Proceedings of Eurospeech'97. Rhodes
, 1997
"... Context-dependent models for language units are essential in highaccuracy speech recognition. However, standard speech recognition frameworks are based on the substitution of lower-level models for higher-level units. Since substitution cannot express context-dependency constraints, actual recog ..."
Abstract
-
Cited by 10 (7 self)
- Add to MetaCart
Context-dependent models for language units are essential in highaccuracy speech recognition. However, standard speech recognition frameworks are based on the substitution of lower-level models for higher-level units. Since substitution cannot express context-dependency constraints, actual recognizers use restrictive model-structure assumptions and specialized code for context-dependent models, leading to decreased flexibility and lost opportunities for automatic model optimization. Instead, we propose a recognition framework that builds in the possibility of context dependency from the start by using weighted finite-state transduction rather than substitution. The framework is implemented with a general demand-driven transducer composition algorithm that allows great flexibility in model structure, form of context dependency and network expansion method, while achieving competitive recognition performance. 1 Introduction 1.1 The Substitution Architecture In the standard...
Predictive hidden Markov model selection for speech recognition
- IEEE Trans. Speech Audio Process
, 2005
"... Abstract—This paper surveys a series of model selection approaches and presents a novel predictive information criterion (PIC) for hidden Markov model (HMM) selection. The approximate Bayesian using Viterbi approach is applied for PIC selection of the best HMMs providing the largest prediction infor ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
Abstract—This paper surveys a series of model selection approaches and presents a novel predictive information criterion (PIC) for hidden Markov model (HMM) selection. The approximate Bayesian using Viterbi approach is applied for PIC selection of the best HMMs providing the largest prediction information for generalization of future data. When the perturbation of HMM parameters is expressed by a product of conjugate prior densities, the segmental prediction information is derived at the frame level without Laplacian integral approximation. In particular, a multivariate distribution is attained to characterize the prediction information corresponding to HMM mean vector and precision matrix. When performing model selection in tree structure HMMs, we develop a top-down prior/posterior propagation algorithm for estimation of structural hyperparameters. The prediction information is determined so as to choose the best HMM tree model. Different from maximum likelihood (ML) and minimum description length (MDL) selection criteria, the parameters of PIC chosen HMMs are computed via maximum a posteriori estimation. In the evaluation of continuous speech recognition using decision tree HMMs, the PIC criterion outperforms ML and MDL criteria in building a compact tree structure with moderate tree size and higher recognition rate. Index Terms—Approximate Bayesian, decision tree state tying, model selection, multivariate distribution, predictive information criterion, prior/posterior propagation, speech recognition. I.

