From HMM's to Segment Models: A Unified View of Stochastic Modeling for Speech Recognition, 1996
Lattice-Based Search Strategies For Large Vocabulary Speech Recognition, 1995
Cited by 9 (1 self)
The design of search algorithms is an important issue in recognition, particularly for very large vocabulary, continuous speech. It is an especially crucial problem when computationally expensive knowledge sources are used in the system, as is necessary to achieve high accuracy. Recently, multi-pass search strategies have been used as a means of applying inexpensive knowledge sources early on to prune the search space for subsequent passes using more expensive knowledge sources. Three multi-pass search algorithms are investigated in this thesis work: the N-best search algorithm, a lattice dynamic programming search algorithm and a lattice local search algorithm. Both the lattice dynamic programming and lattice local search algorithms are shown to achieve comparable performance to the N-best search algorithm while running as much as 10 times faster on a 20,000 word vocabulary task. The lattice local search algorithm is also shown to have the additional advantage over the lattice dynamic programming search algorithm of allowing sentence-level knowledge sources to be incorporated into the search.
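The lattice dynamic programming idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes a hypothetical lattice given as a list of (source node, destination node, word, log score) edges, with integer node ids numbered so that every edge goes from a lower id to a higher one, and it only finds the single best-scoring path.

```python
from collections import defaultdict

def best_path(edges, start, end):
    """Return (total_log_score, words) for the best path from start to end.

    Assumptions (hypothetical, for illustration only):
    - edges is a list of (src, dst, word, log_score) tuples
    - node ids are integers and src < dst for every edge, so sorting
      the node ids gives a valid topological order
    - end is reachable from start (otherwise a KeyError is raised)
    """
    outgoing = defaultdict(list)
    nodes = set()
    for src, dst, word, log_score in edges:
        outgoing[src].append((dst, word, log_score))
        nodes.update((src, dst))

    # best[node] holds the best score and word sequence reaching that node
    best = {start: (0.0, [])}
    for node in sorted(nodes):
        if node not in best:
            continue  # node is not reachable from start
        score_so_far, words = best[node]
        for dst, word, log_score in outgoing[node]:
            candidate = score_so_far + log_score
            if dst not in best or candidate > best[dst][0]:
                best[dst] = (candidate, words + [word])
    return best[end]

# Toy lattice with invented scores
toy_edges = [
    (0, 1, "recognize", -2.0),
    (0, 1, "wreck a nice", -1.5),
    (1, 2, "speech", -1.0),
    (1, 2, "beach", -2.5),
]
score, words = best_path(toy_edges, 0, 2)
```

Because each node is visited once and each edge is relaxed once, the cost grows with the size of the lattice rather than with the number of distinct sentence hypotheses it encodes.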
Articulatory Methods for Speech Production and Recognition, 1996
Cited by 9 (0 self)
roduction-based knowledge into the recognition framework. By using an explicit time-domain articulatory model of the mechanisms of co-articulation, it is hoped to obtain a more accurate model of contextual effects in the acoustic signal, while using fewer parameters than traditional acoustically-driven approaches. Separate articulatory and acoustic models are provided, and in each case the parameters of the models are automatically optimised over a training data set. A predictive statistically-based model of co-articulation is described, and found to yield improved articulatory modelling accuracy compared with X-ray articulatory traces. Parameterised acoustic vectors are synthesised by a set of artificial neural networks, and the resulting acoustic representations are used to re-score N-best recognition hypothesis lists produced by an HMM-based recogniser. The system is evaluated on two test databases, one including speaker-specific X-ray training data and the other aco
Continuous Word Recognition Based on the Stochastic Segment Model, Proc. DARPA Workshop CSR, 1992
Cited by 8 (2 self)
This paper presents an overview of the Boston University continuous word recognition system, which is based on the Stochastic Segment Model (SSM). The key components of the system described here include: a segment-based acoustic model that uses a family of Gaussian distributions to characterize variable length segments; a divisive clustering technique for estimating robust context-dependent models; and recognition using the N-best rescoring formalism, which also provides a mechanism for combining different knowledge sources (e.g. SSM and HMM scores). Results are reported for the speaker-independent portion of the Resource Management Corpus, for both the SSM system and a combined BU-SSM/BBN-HMM system.
1. INTRODUCTION
In the last decade, most of the research on continuous speech recognition has focused on different variations of hidden Markov models (HMMs), and the various efforts have led to significant improvements in recognition performance. However, some researchers have begun to ...
Robust Estimation of Stochastic Segment Models for Word Recognition, 1990
In this work, we develop robust estimation techniques for a continuous-word recognition system using the Stochastic Segment model (SSM). This work is done under the N-best rescoring formalism, where a less complex system than the SSM is used to generate candidate hypotheses which are then rescored and reranked by the SSM. Components of the system that are the focus of this work include estimation of weights for score combination and robust parameter estimation using clustering techniques to model context. In particular, we develop several agglomerative and divisive clustering techniques for multivariate Gaussian distributions, which we use to cluster triphone models. This leads to better estimates with fewer parameters resulting in reduction in word error and storage/computation costs over using unclustered triphones. We also implement an SSM system based on microsegments which combines mixture modeling with trajectory modeling and examine the tradeoffs involved between the allocation ...
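The N-best rescoring formalism described in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: it assumes each hypothesis carries a log score from each knowledge source and that the scores are combined linearly. The score names ("hmm", "ssm"), the weights, and the score values are all invented for the example; the paper estimates its combination weights, which is not shown here.

```python
def rescore_nbest(hypotheses, weights):
    """Rerank an N-best list by a weighted sum of per-source log scores.

    hypotheses: list of (text, {source_name: log_score}) pairs
    weights:    {source_name: weight} (hypothetical values; in practice
                these would be estimated on held-out data)
    Returns the hypotheses sorted from best to worst combined score.
    """
    def combined(scores):
        return sum(weights[name] * scores[name] for name in weights)

    return sorted(hypotheses, key=lambda h: combined(h[1]), reverse=True)

# Toy N-best list with invented scores: the first-pass (HMM) score
# prefers the second hypothesis, the segment-model score prefers the first.
nbest = [
    ("show all ships", {"hmm": -10.0, "ssm": -14.0}),
    ("show oil ships", {"hmm": -9.0, "ssm": -18.0}),
]
reranked = rescore_nbest(nbest, {"hmm": 1.0, "ssm": 0.5})
```

In this toy case the combined scores are -17.0 and -18.0, so the rescoring pass overturns the first-pass ranking, which is the purpose of applying the more expensive model only to the candidate list.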

