## GENERALIZED WORD POSTERIOR PROBABILITY (GWPP) FOR MEASURING RELIABILITY OF RECOGNIZED WORDS

Citations: 9 (5 self)

### BibTeX

@MISC{Soong_generalizedword,

author = {Frank K. Soong and Wai-kit Lo and Satoshi Nakamura},

title = {GENERALIZED WORD POSTERIOR PROBABILITY (GWPP) FOR MEASURING RELIABILITY OF RECOGNIZED WORDS},

year = {}

}

### Abstract

To measure the reliability of recognized words in an ASR system, we propose a generalized word posterior probability (GWPP) as the sole confidence measure. This measure is computed efficiently via a word graph with the forward-backward algorithm, or directly from the generalized string likelihoods of the N-best strings output by the recognizer. The GWPP is a modified word posterior probability in which a word event, given all the acoustic observations of an utterance, is measured as a conditional probability. Time registration of the starting and ending frames of a hypothesized word is relaxed, similar to the Baum-Welch model training algorithm, and the acoustic and language model weights are optimally adjusted to accommodate the instrumental but inaccurate modeling assumptions used in implementing those two models. When tested on the ATR Japanese BTEC speech database, the confidence error rates are reduced by as much as 25% at various operating points.
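The N-best variant described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each hypothesis is a list of words carrying start and end frames plus log acoustic and log language model scores, that "relaxed time registration" means any frame overlap with the queried interval counts as a match, and that the total observation likelihood is approximated by summing the weighted likelihoods of all N-best strings. The names `WordHyp`, `string_score`, `overlaps`, and `gwpp`, the toy scores, and the weights `alpha` and `beta` are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class WordHyp:
    """One word in a hypothesis string (hypothetical container)."""
    word: str
    start: int      # first frame, inclusive
    end: int        # last frame, inclusive
    ac_logp: float  # log acoustic likelihood of the word segment
    lm_logp: float  # log language model probability given the history


def string_score(hyp, alpha, beta):
    """Weighted log likelihood of a whole string: sum of alpha*AM + beta*LM."""
    return sum(alpha * w.ac_logp + beta * w.lm_logp for w in hyp)


def overlaps(a_start, a_end, b_start, b_end):
    """True if the two inclusive frame intervals share at least one frame."""
    return a_start <= b_end and b_start <= a_end


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def gwpp(nbest, word, start, end, alpha, beta):
    """Posterior of `word` around [start, end], estimated from an N-best list.

    Numerator: strings containing `word` in a segment overlapping [start, end].
    Denominator (assumption): all strings in the N-best list.
    """
    scores = [string_score(h, alpha, beta) for h in nbest]
    matching = [
        s for h, s in zip(nbest, scores)
        if any(w.word == word and overlaps(w.start, w.end, start, end) for w in h)
    ]
    if not matching:
        return 0.0
    return math.exp(logsumexp(matching) - logsumexp(scores))


# Toy 3-best list with made-up scores.
nbest = [
    [WordHyp("hello", 0, 40, -100.0, -2.0), WordHyp("world", 41, 90, -120.0, -3.0)],
    [WordHyp("hello", 0, 38, -101.0, -2.0), WordHyp("word", 39, 90, -125.0, -4.0)],
    [WordHyp("yellow", 0, 42, -110.0, -5.0), WordHyp("world", 43, 90, -119.0, -3.5)],
]
p_hello = gwpp(nbest, "hello", 0, 40, alpha=0.05, beta=1.0)
print(f"GWPP(hello) = {p_hello:.3f}")
```

A recognized word would then be accepted when its GWPP exceeds a threshold chosen on development data. Setting `alpha` well below 1 flattens the acoustic scores so that the posterior is not dominated by one or two top strings; with `alpha = beta = 0` the measure reduces to the fraction of strings that contain the word.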

### Citations

185 | Finding consensus in speech recognition: Word error minimization and other applications of confusion networks
- Mangu, Brill, et al.
- 2000
Citation Context ...ted as fillers. The corresponding word posterior probability of the spotted word is computed. By using this word/filler dichotomy, there is no need to construct a consensus network like the "sausage" [7], lattice chunking [8] or dynamic programming based string alignment [9]. Relevant issues are first discussed and then investigated, including: (1) reducing search space of the optimal word string, i....

102 | Confidence measures for large vocabulary continuous speech recognition
- Wessel, Schlüter, et al.
- 2001
Citation Context ...ted word is computed. By using this word/filler dichotomy, there is no need to construct a consensus network like the "sausage" [7], lattice chunking [8] or dynamic programming based string alignment [9]. Relevant issues are first discussed and then investigated, including: (1) reducing search space of the optimal word string, i.e., a word graph or an N-best list output by an ASR decoder; (2) relaxin...

83 | The N-Best Algorithm: An Efficient and Exact Procedure for Finding the N Most Likely Sentence Hypotheses
- Schwartz, Chow
- 1990
Citation Context ...ikelihood of the best partial hypothesis is usually imposed to prune out unlikely partial hypotheses; furthermore, hypotheses can be pruned and a highly packed word graph [1] or an N-best string list [2,3] can be generated by keeping only a subset of string hypotheses which are much more likely than other strings. We will be using such a subset, in computing the word posterior probability in later expe...

81 | A Word Graph Algorithm for Large Vocabulary Continuous Speech Recognition
- Ortmanns, Ney, et al.
- 1997
Citation Context ...i search, a beam within the likelihood of the best partial hypothesis is usually imposed to prune out unlikely partial hypotheses; furthermore, hypotheses can be pruned and a highly packed word graph [1] or an N-best string list [2,3] can be generated by keeping only a subset of string hypotheses which are much more likely than other strings. We will be using such a subset, in computing the word post...

78 | Toward a Broad-coverage Bilingual Corpus for Speech Translation of Travel Conversation in the Real World
- Takezawa, Sumita, et al.
- 2002
Citation Context ...re learned from given training or development data. 5 EXPERIMENTAL SETUP In this study, the proposed GWPP for word acceptance/rejection is tested on the Japanese Basic Travel Expression Corpus (BTEC) [10]. Two testing sets were used, namely set01 and set02 with 510 and 508 utterances, respectively. Each data set contains different utterances recorded from 10 different speakers who are different from o...

38 | A tree-trellis based fast search for finding the N best sentence hypotheses in continuous speech recognition
- Soong, Huang
- 1990
Citation Context ...ikelihood of the best partial hypothesis is usually imposed to prune out unlikely partial hypotheses; furthermore, hypotheses can be pruned and a highly packed word graph [1] or an N-best string list [2,3] can be generated by keeping only a subset of string hypotheses which are much more likely than other strings. We will be using such a subset, in computing the word posterior probability in later expe...

20 | A general algorithm for word graph matrix decomposition
- Hakkani-Tur, Riccardi
- 2003
Citation Context ...rresponding word posterior probability of the spotted word is computed. By using this word/filler dichotomy, there is no need to construct a consensus network like the "sausage" [7], lattice chunking [8] or dynamic programming based string alignment [9]. Relevant issues are first discussed and then investigated, including: (1) reducing search space of the optimal word string, i.e., a word graph or an...

12 | Spontaneous dialogue speech recognition using cross-word context constrained word graph
- Shimizu, Yamamoto, et al.
Citation Context ... set contains different utterances recorded from 10 different speakers who are different from one set to the other. The recognition system used in our experiments is the ATRIUMS Version 2.2 from ATR [11]. Specifically, for our investigation, the LVCSR is configured to generate 100-best hypotheses recognition output together with the word graph for every utterance and the search is constrained with a g...

2 | Confidence Scoring Based on Recognition Engine Julius, The 2003 Autumn Meeting of the Acoustical Society of Japan
- Lee, Shikano, et al.

1 | Real-Time Confidence Scoring Based on Word Posterior Probability on two-pass search algorithm
- Lee, Kawahara, et al.
- 2003

1 | A Word-spotting Hypothesis Testing for Accepting/Rejecting Continuous Speech Recognition Output
- Soong, Lo, et al.
- 2003
Citation Context ...lementations and to prevent the word posterior probability from being dominated by just a few top strings with high likelihoods, we modified Eqn.(4) to a generalized word posterior probability (GWPP) [6] as

$$
p\big([w;s,t] \mid x_1^T\big) = \sum_{\substack{M,\ [w;s,t]_1^M \\ \exists n,\ 1 \le n \le M \\ w = w_n,\ [s_n,t_n] \cap [s,t] \ne \phi}} \frac{\prod_{m=1}^{M} p^{\alpha}\big(x_{s_m}^{t_m} \mid w_m\big)\, p^{\beta}\big(w_m \mid w_1^{m-1}\big)}{p\big(x_1^T\big)}
$$

Obviously, the denominator term, $p(x_1^T)$ ...