## Iterative posterior-based keyword spotting without filler models

Venue: In Proceedings of the IEEE Intl. Conf. on Acoustics, Speech, and Signal Processing

Citations: 10 (3 self)

### BibTeX

@INPROCEEDINGS{Silaghi_iterativeposteriorbased,

author = {Marius-Călin Silaghi and Hervé Bourlard},

title = {Iterative posterior-based keyword spotting without filler models},

booktitle = {In Proceedings of the IEEE Intl. Conf. on Acoustics, Speech, and Signal Processing},

year = {}

}

### Abstract

This paper addresses the problem of detecting keywords in unconstrained speech without explicit modeling of non-keyword segments. The proposed algorithm builds on recent developments in confidence measures using local posterior probabilities, and searches for the segment maximizing the average observation posterior along the most likely path in the hypothesized keyword model. As is well known, this approach (sometimes referred to as the sliding model method) requires relaxing the begin/endpoints of the Viterbi matching, as well as time-normalizing the resulting score, which makes dynamic programming sub-optimal or more complex (more computation and/or more memory). We present here a simple and efficient alternative solution to this problem, using an iterative form of the Viterbi decoding algorithm that does not require scoring all possible begin/endpoints. A convergence proof of this algorithm is available [8]. Results obtained with this method on 100 keywords chosen at random from the BREF database [5] are reported.
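The idea of iterating Viterbi passes instead of scoring every begin/endpoint pair can be illustrated with a small sketch. This is not the authors' implementation. It assumes a strict left-to-right keyword model with uniform transitions (no entrance or exit penalties), takes the score to be the average of the negative log posteriors along the path, and uses a threshold update in the style of fractional programming: each pass finds the segment minimizing the summed cost minus a per-frame threshold `eps`, then resets `eps` to that segment's average cost. The names `best_segment` and `iterative_spot` are hypothetical.

```python
import math
from typing import List, Tuple


def best_segment(cost: List[List[float]], eps: float) -> Tuple[float, int, int]:
    """One Viterbi pass with free begin/endpoints.

    Minimises sum(cost[n][state] - eps) over all segments [b, e] and all
    left-to-right paths that start in state 0 and end in the last state.
    Returns (score, b, e).
    """
    n_states = len(cost[0])
    prev_score = [math.inf] * n_states
    prev_start = [-1] * n_states
    best = (math.inf, -1, -1)
    for n, frame in enumerate(cost):
        score = [math.inf] * n_states
        start = [-1] * n_states
        for k in range(n_states):
            # Either stay in state k or advance from state k - 1.
            cand, cand_start = prev_score[k], prev_start[k]
            if k > 0 and prev_score[k - 1] < cand:
                cand, cand_start = prev_score[k - 1], prev_start[k - 1]
            # Free begin point: a new segment may start in state 0 at frame n.
            if k == 0 and cand > 0.0:
                cand, cand_start = 0.0, n
            if cand < math.inf:
                score[k] = cand + frame[k] - eps
                start[k] = cand_start
        # Free endpoint: a segment may end whenever the last state is reached.
        if score[-1] < best[0]:
            best = (score[-1], start[-1], n)
        prev_score, prev_start = score, start
    return best


def iterative_spot(log_post: List[List[float]],
                   max_iter: int = 100,
                   tol: float = 1e-12) -> Tuple[float, int, int]:
    """Repeat Viterbi passes until the average cost stops improving.

    log_post[n][k] is the log posterior of keyword state k at frame n.
    Returns (average negative log posterior, begin frame, end frame).
    """
    cost = [[-lp for lp in frame] for frame in log_post]
    # Start above every local cost so the first pass accepts some segment.
    eps = max(max(frame) for frame in cost) + 1.0
    begin = end = -1
    for _ in range(max_iter):
        score, b, e = best_segment(cost, eps)
        if score >= -tol:  # no segment beats the current average
            break
        begin, end = b, e
        eps += score / (e - b + 1)  # new threshold = average cost of segment
    return eps, begin, end
```

Under these assumptions, each pass costs one ordinary Viterbi sweep, and the threshold can only decrease, so only a handful of sweeps are typically needed rather than one per candidate begin/endpoint pair. For a two-state model with frame posteriors `[[0.1, 0.1], [0.1, 0.1], [0.8, 0.05], [0.9, 0.05], [0.05, 0.9], [0.1, 0.1]]`, passing their logarithms to `iterative_spot` selects frames 3 to 4, with an average cost of `-log(0.9)`.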

### Citations

473 |
Connectionist Speech Recognition – A Hybrid Approach
- Bourlard, Morgan
- 1994
Citation Context ..., as optimized in [3], but these have not been optimized here. In our case, local posteriors P(q_ℓ | x_n) were estimated as output values of a multilayer perceptron (MLP) used in a hybrid HMM/ANN system [2]. For a specific sub-sequence X_b^e, expression (1) can easily be estimated by dynamic programming since the subsequence and the associated normalizing factor (e − b + 1) are given. However, in the c...

79 | BREF, a large vocabulary spoken corpus for French
- Lamel, Gauvain, et al.
- 1991
Citation Context ...s not require scoring for all possible begin/endpoints. Convergence proof of this algorithm is available [8]. Results obtained with this method on 100 keywords chosen at random from the BREF database [5] are reported. 1. INTRODUCTION This paper addresses the problem of keyword spotting (KWS) in unconstrained speech without explicit modeling of non-keyword segments (typically done by using filler HMM ...

77 |
A hidden Markov model based keyword recognition system
- Rose, Paul
- 1990
Citation Context ...ptimization of (2) as, e.g., in [4, 11], most of the keyword spotting approaches today prefer to preserve the optimality and simplicity of Viterbi DP by modeling the complete input [6] and explicitly [7] or implicitly [3] modeling non-keyword segments by using so-called filler or garbage models as additional reference models. In this case, we assume that non-keyword segments are modeled by extraneous...

39 |
Optimizing recognition and rejection performance in wordspotting systems
- Bourlard, D’hoore, et al.
- 1994
Citation Context ... simply used here as the non-emitting initial and final state of M. Transition probabilities P(q_b | q_G) and P(q_G | q_e) can be interpreted as the keyword entrance and exit penalties, as optimized in [3], but these have not been optimized here. In our case, local posteriors P(q_ℓ | x_n) were estimated as output values of a multilayer perceptron (MLP) used in a hybrid HMM/ANN system [2]. For a specific s...

37 |
Vocabulary independent discriminative utterance verification for nonkeyword rejection in subword based speech recognition
- Sukkar, Lee
- 1996
Citation Context ...involved the estimation and rescoring of N-best hypotheses. Similar work and conclusions (also using N-best rescoring) were also reported in using likelihood ratio rescoring and non-keyword rejection [9]. In this paper, we will use a similar scoring technique for keyword spotting without explicit filler model. Compared to previously devised “sliding model” methods (such as [4, 11]), the algorithm pro...

29 | Confidence measures for hybrid HMM/ANN speech recognition
- Williams, Renals
- 1997
Citation Context ...good confidence measures and good scores for the reestimation of N-best hypotheses. Similar work, where this kind of confidence measure was compared to several alternative approaches, was reported in [10] and confirmed this conclusion. However, so far, the evaluation of such confidence measures involved the estimation and rescoring of N-best hypotheses. Similar work and conclusions (also using N-best ...

7 |
An efficient elastic-template method for detecting given words in running speech
- Bridle
- 1973
Citation Context ...or independent phone models without lexical constraints). Although several algorithms tackling this type of problem have already been proposed in the past, e.g., by using Dynamic Time Warping (DTW) [4] or Viterbi matching [11] allowing relaxation of the (begin and endpoint) constraints, these are known to require the use of an “appropriate” normalization of the matching scores since segments of dif...

5 |
Application of hidden Markov models of keywords in unconstrained speech
- Wilpon, Rabiner, et al.
- 1989
Citation Context ...els without lexical constraints). Although several algorithms tackling this type of problem have already been proposed in the past, e.g., by using Dynamic Time Warping (DTW) [4] or Viterbi matching [11] allowing relaxation of the (begin and endpoint) constraints, these are known to require the use of an “appropriate” normalization of the matching scores since segments of different lengths have then ...

4 |
Improving posterior-based confidence measures in hybrid HMM/ANN speech recognition systems
- Bernardis, Bourlard
- 1998
Citation Context ...or more advanced scoring criteria (such as the confidence measures mentioned below). More recently, work in the field of confidence level, and in the framework of hybrid HMM/ANN systems, it was shown [1] that the use of accumulated local posterior probabilities (as obtained at the output of a multilayer perceptron) normalized by the length of the word segment (or, better, involving a double normaliza...

4 |
Word spotting
- Rohlicek
- 1995
Citation Context ...owards the direct optimization of (2) as, e.g., in [4, 11], most of the keyword spotting approaches today prefer to preserve the optimality and simplicity of Viterbi DP by modeling the complete input [6] and explicitly [7] or implicitly [3] modeling non-keyword segments by using so-called filler or garbage models as additional reference models. In this case, we assume that non-keyword segments are mo...

1 | Posterior-Based Keyword Spotting Approaches Without Filler Models
- Silaghi, Bourlard
- 1999
Citation Context ... solution to this problem, using an iterative form of Viterbi decoding algorithm, but which does not require scoring for all possible begin/endpoints. Convergence proof of this algorithm is available [8]. Results obtained with this method on 100 keywords chosen at random from the BREF database [5] are reported. 1. INTRODUCTION This paper addresses the problem of keyword spotting (KWS) in unconstraine...