Dynaspeak: SRI’s scalable speech recognizer for embedded and mobile systems
- in Proceedings of HLT
, 2002
Cited by 6 (3 self)
We introduce SRI’s new speech recognition engine, DynaSpeak™, which is characterized by its scalability and flexibility, high recognition accuracy, memory and speed efficiency, adaptation capability, efficient grammar optimization, support for natural language parsing functionality, and operation based on integer arithmetic. These features are designed to address the needs of the fast-developing and changing domain of embedded and mobile computing platforms.
Improved Modeling and Efficiency for Automatic Transcription of Broadcast News
, 2000
Cited by 5 (0 self)
Over the last few years, the DARPA-sponsored Hub4 continuous speech recognition evaluations have pushed speech recognition technology for the very interesting and difficult task of automatically transcribing broadcast news. In this paper, we report on our research and progress on this problem. We focus on individual techniques we developed, rather than on descriptions of our evaluation systems. We provide comparative experimental results showing the improvements obtained with the novel approaches we developed. 1 Introduction In recent years there has been increasing interest in developing large-vocabulary continuous speech recognition (LVCSR) systems for speech found in real sources. Broadcast news, in particular, has been the testbed for the DARPA-sponsored Hub4 continuous speech recognition (CSR) evaluations over the last few years, and represents a significant challenge to speech recognition researchers. Many interesting problems are associated with the automatic recognition of b...
Parameter Tying and Gaussian Clustering for Faster, Better, and Smaller Speech Recognition
, 1999
Cited by 2 (1 self)
We present a new view of hidden Markov model (HMM) state tying, showing that the accuracy of phonetically tied mixture (PTM) models is similar to, or better than, that of the more typical state-clustered HMM systems. The PTM models require fewer Gaussian distance computations during recognition, and can lead to recognition speedups. We describe a per-phone Gaussian clustering algorithm that automatically determines the number of Gaussians for each phone in the PTM model. Experimental results show that this method gives a substantial decrease in the number of Gaussians and a corresponding speedup with little degradation in accuracy. Finally, we study mixture weight thresholding algorithms to drastically decrease the number of mixture weights in the PTM model without degrading accuracy. More than a factor of 10 reduction in mixture weights is achieved with no degradation in performance. 1. Introduction In most state-of-the-art hidden Markov model (HMM)-based speech recognition systems, H...
SRI's 1998 Broadcast News System -- Toward Faster, Better, Smaller Speech Recognition
- In Proceedings of the DARPA Broadcast News Workshop
, 1999
We describe several new research directions we investigated toward the development of our broadcast news transcription system for the 1998 DARPA H4 evaluations. Our goal was to develop significantly faster and smaller speech recognition systems without degrading the word error rate of our 1997 system. We did this through significant algorithmic research creating various new techniques. A sample of these techniques was used to put together our 1998 broadcast news system, which is conceptually much simpler, faster, and smaller, but gives the same word error rate as our 1997 system. In particular, our 1998 system is based on a simple phonetically tied mixture (PTM) model with a total of only 13,000 Gaussians, as compared to a 67,000-Gaussian state-clustered system we used in 1997. 1. Introduction One of our main goals in 1998 was to significantly increase speed and decrease model size, while maintaining or improving accuracy. These goals are difficult to achieve simultaneously because o...
DynaSpeak: SRI's Scalable Speech Recognizer for
- in Proceedings of HLT
, 2002
We introduce SRI's new speech recognition engine, DynaSpeak, which is characterized by its scalability and flexibility, high recognition accuracy, memory and speed efficiency, adaptation capability, efficient grammar optimization, support for natural language parsing functionality, and operation based on integer arithmetic. These features are designed to address the needs of the fast-developing and changing domain of embedded and mobile computing platforms.

