Results 1 - 3 of 3
Continuous Word Recognition Based on the Stochastic Segment Model
Proc. DARPA Workshop CSR, 1992
Abstract
Cited by 8 (2 self)
This paper presents an overview of the Boston University continuous word recognition system, which is based on the Stochastic Segment Model (SSM). The key components of the system described here include: a segment-based acoustic model that uses a family of Gaussian distributions to characterize variable length segments; a divisive clustering technique for estimating robust context-dependent models; and recognition using the N-best rescoring formalism, which also provides a mechanism for combining different knowledge sources (e.g. SSM and HMM scores). Results are reported for the speaker-independent portion of the Resource Management Corpus, for both the SSM system and a combined BU-SSM/BBN-HMM system.
1. INTRODUCTION
In the last decade, most of the research on continuous speech recognition has focused on different variations of hidden Markov models (HMMs), and the various efforts have led to significant improvements in recognition performance. However, some researchers have begun to ...
Robust Automatic Speech Recognition With Unreliable Data
1999
Abstract
Cited by 2 (0 self)
Theoretical and practical issues of some of the problems in robust automatic speech recognition (ASR) and some of the techniques that address them are presented in this report. The problem of the robustness of the ASR in real-life (as opposed to laboratory) conditions is paramount to the widespread deployment of speech-enabled products. The report reviews techniques used so far for robust ASR, ranging from simple spectrum subtraction to various types of model adaptation. A possible connection of robust ASR with the computational auditory scene analysis (CASA), methods for local Signal-to-Noise Ratio (SNR) estimation and classification/scoring with on-line adapted statistical models is discussed. The main focus is on the techniques that would allow for incorporation of CASA and local SNR estimates (used as methods for speech/non-speech separation) into the present prevailing stochastic pattern matching paradigms: hidden Markov models (HMM) and artificial neural networks (ANN). Th...
Robust Estimation of Stochastic Segment Models for Word Recognition
1990
Abstract
In this work, we develop robust estimation techniques for a continuous-word recognition system using the Stochastic Segment Model (SSM). This work is done under the N-best rescoring formalism, where a less complex system than the SSM is used to generate candidate hypotheses which are then rescored and reranked by the SSM. Components of the system that are the focus of this work include estimation of weights for score combination and robust parameter estimation using clustering techniques to model context. In particular, we develop several agglomerative and divisive clustering techniques for multivariate Gaussian distributions, which we use to cluster triphone models. This leads to better estimates with fewer parameters, resulting in a reduction in word error and in storage/computation costs compared with unclustered triphones. We also implement an SSM system based on microsegments which combines mixture modeling with trajectory modeling and examine the tradeoffs involved between the allocation ...
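The N-best rescoring formalism mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each candidate hypothesis carries one log-domain score per knowledge source and that the scores are combined as a weighted sum before reranking. The names (`Hypothesis`, `combined_score`, `rescore`), the example word strings, the score values, and the weights are all hypothetical and are not taken from the listed papers.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hypothesis:
    """One candidate sentence from an N-best list (illustrative structure)."""
    words: List[str]
    scores: Dict[str, float]  # log-domain score per knowledge source


def combined_score(hyp: Hypothesis, weights: Dict[str, float]) -> float:
    """Weighted linear combination of the per-source log scores."""
    return sum(w * hyp.scores[name] for name, w in weights.items())


def rescore(nbest: List[Hypothesis], weights: Dict[str, float]) -> List[Hypothesis]:
    """Return the hypotheses reranked by combined score, best first."""
    return sorted(nbest, key=lambda h: combined_score(h, weights), reverse=True)


# Invented scores: the first-pass (HMM) score alone prefers the second
# hypothesis, while adding the segment-model (SSM) score moves the first
# hypothesis to the top.
nbest = [
    Hypothesis(["show", "all", "ships"], {"hmm": -120.0, "ssm": -98.0}),
    Hypothesis(["show", "tall", "ships"], {"hmm": -118.0, "ssm": -105.0}),
    Hypothesis(["show", "all", "chips"], {"hmm": -125.0, "ssm": -101.0}),
]

hmm_only = rescore(nbest, {"hmm": 1.0, "ssm": 0.0})
combined = rescore(nbest, {"hmm": 1.0, "ssm": 1.0})
print(" ".join(hmm_only[0].words))
print(" ".join(combined[0].words))
```

In this sketch the weights are fixed by hand; the abstract says the combination weights are estimated, and that estimation step is not shown here.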

