Results 11 - 17 of 17
Improvement In N-Best Search For Continuous Speech Recognition
Abstract
This paper proposes several techniques for reducing the search complexity of beam search in a continuous speech recognition task. Six heuristic pruning methods are described, and the pruning parameters are adjusted to hold the word error rate constant while reducing computational complexity and memory demand. The effect of each pruning method is evaluated with the Mixture Stochastic Trajectory Model (MSTM), a segment-based model that uses phonemes as the speech units. Tests on a speaker-dependent continuous speech recognition task show that the pruning methods reduce search effort by a substantial 67%, measured as the number of phonemes hypothesised during the search. All proposed techniques are independent of the acoustic models and are therefore applicable to other acoustic modeling techniques.
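The abstract does not name its six pruning heuristics, so the following is only a minimal sketch of two generic beam-search pruning steps (score-threshold pruning and a cap on the number of active hypotheses), not the paper's method. The function name `prune` and the parameters `beam_width` and `max_active` are assumptions chosen for illustration.

```python
def prune(hypotheses, beam_width, max_active):
    """Generic beam pruning sketch (illustrative, not the paper's six heuristics).

    hypotheses: list of (log_score, label) tuples; a higher log_score is better.
    beam_width: hypothetical threshold on the distance from the best log score.
    max_active: hypothetical cap on the number of surviving hypotheses.
    """
    if not hypotheses:
        return []
    best = max(score for score, _ in hypotheses)
    # Threshold pruning: drop hypotheses that fall too far below the best score.
    survivors = [h for h in hypotheses if h[0] >= best - beam_width]
    # Rank pruning: keep at most max_active of the remaining hypotheses.
    survivors.sort(key=lambda h: h[0], reverse=True)
    return survivors[:max_active]


hyps = [(-10.0, "a"), (-12.5, "b"), (-11.0, "c"), (-30.0, "d"), (-14.0, "e")]
kept = prune(hyps, beam_width=5.0, max_active=3)
kept_labels = [label for _, label in kept]
```

Tightening `beam_width` or `max_active` lowers the number of hypotheses expanded per step, which is the kind of search-effort measure the abstract reports, at the risk of pruning the correct hypothesis.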
Large vocabulary continuous
, 2002
Abstract
Automatic speech recognition of real-live broadcast news (BN) data (Hub-4) has become a challenging research topic in recent years. This paper summarizes our key efforts to build a large vocabulary continuous speech recognition system for the heterogeneous BN task without inducing undesired complexity and computational resources. These key efforts included ...
Prof. dr. ir. D.A. van Leeuwen Prof. dr. ir. A.P. de Vries
"... supervisor, assistant supervisor, chair and secretary ..."
Online Adaptive Learning for Speech Recognition Decoding
Abstract
We describe a new method for pruning in dynamic models based on running an adaptive filtering algorithm online during decoding to predict aspects of the scores in the near future. These predictions are used to make well-informed pruning decisions during model expansion. We apply this idea to the case of dynamic graphical models and test it on a speech recognition database derived from Switchboard. Results show that significant speedups (approximately a factor of 2) can be obtained without any increase in word error rate or memory usage. Index Terms: graphical models, decoding, speech recognition, online learning
The Time-Conditioned Approach in Dynamic Programming Search for LVCSR
Abstract
This paper presents the time-conditioned approach in dynamic programming search for large-vocabulary continuous speech recognition. The following topics are presented: the baseline algorithm, a time-synchronous beam search version, a comparison with the word-conditioned approach, and a comparison with stack decoding. The approach has been successfully tested on the NAB task using a vocabulary of 64 000 words. Index Terms: beam search, dynamic programming, large vocabulary speech recognition, one-pass DP search, search organization, time-conditioned DP search.
Fast N-Gram Language Model Look-Ahead for Decoders With Static Pronunciation Prefix Trees
Abstract
Decoders that use token passing restrict their search space by various types of token pruning. The Language Model Look-Ahead (LMLA) technique makes it possible to increase the number of tokens that can be pruned without loss of decoding precision. Unfortunately, for token-passing decoders that use a single static pronunciation prefix tree, full n-gram LMLA considerably increases the number of language model probability calculations needed. This paper introduces a method for applying full n-gram LMLA in a decoder with a single static pronunciation tree. The experiments show that this method improves the speed of the decoder without an increase in search errors.
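As a rough illustration of the general look-ahead idea (not the method introduced in the paper), the sketch below annotates each node of a pronunciation prefix tree with the best language model log-probability of any word still reachable from that node. It assumes a single fixed probability table, which corresponds to a unigram-like simplification; with a full n-gram model the table would depend on the word history. The lexicon, the probabilities, and the names `build_tree` and `annotate_lookahead` are all hypothetical.

```python
import math


def build_tree(lexicon):
    """Build a pronunciation prefix tree (nested dicts) from word -> phoneme list."""
    root = {"children": {}, "words": []}
    for word, phones in lexicon.items():
        node = root
        for p in phones:
            node = node["children"].setdefault(p, {"children": {}, "words": []})
        node["words"].append(word)  # the word ends at this node
    return root


def annotate_lookahead(node, lm_logprob):
    """Store in each node the best LM log-probability of any word reachable from it."""
    best = max((lm_logprob[w] for w in node["words"]), default=-math.inf)
    for child in node["children"].values():
        best = max(best, annotate_lookahead(child, lm_logprob))
    node["lookahead"] = best
    return best


# Hypothetical toy lexicon and unigram-like log-probabilities.
lexicon = {"cat": ["k", "ae", "t"], "cab": ["k", "ae", "b"], "dog": ["d", "ao", "g"]}
lm_logprob = {"cat": math.log(0.5), "cab": math.log(0.1), "dog": math.log(0.4)}

root = build_tree(lexicon)
annotate_lookahead(root, lm_logprob)
```

A token sitting at an inner node can then add `node["lookahead"]` to its acoustic score before pruning, so unlikely words are discarded before their final phoneme is reached.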
Fast and Scalable Decoding with Language Model Look-Ahead for Phrase-based Statistical Machine Translation
Abstract
In this work we present two extensions to the well-known dynamic programming beam search in phrase-based statistical machine translation (SMT), aiming at increased efficiency of decoding by minimizing the number of language model computations and hypothesis expansions. Our results show that language-model-based pre-sorting yields a small improvement in translation quality and a speedup by a factor of 2. Two look-ahead methods are shown to further increase translation speed by a factor of 2 without changing the search space and a factor of 4 with the side-effect of some additional search errors. We compare our approach with Moses and observe the same performance, but a substantially better trade-off between translation quality and speed. At a speed of roughly 70 words per second, Moses reaches 17.2% BLEU, whereas our approach yields 20.0% with identical models.

