Results 1 - 3 of 3
Finding Consensus in Speech Recognition: Word Error Minimization and Other Applications of Confusion Networks
, 2000
Cited by 115 (14 self)
We describe a new framework for distilling information from word lattices to improve the accuracy of speech recognition and obtain a more perspicuous representation of a set of alternative hypotheses. In the standard MAP decoding approach the recognizer outputs the string of words corresponding to the path with the highest posterior probability given the acoustics and a language model. However, even given optimal models, the MAP decoder does not necessarily minimize the commonly used performance metric, word error rate (WER). We describe a method for explicitly minimizing WER by extracting word hypotheses with the highest posterior probabilities from word lattices. We change the standard problem formulation by replacing global search over a large set of sentence hypotheses with local search over a small set of word candidates. In addition to improving the accuracy of the recognizer, our method produces a new representation of the set of candidate hypotheses that specifies ...
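The contrast the abstract draws between MAP decoding and word-level posterior decoding can be illustrated with a minimal sketch. It assumes the sentence hypotheses are already aligned position by position and have equal length, which is a simplification: the paper works from word lattices, and the alignment step is not reproduced here. The function names and the toy posteriors are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Sentence = Tuple[str, ...]


def map_hypothesis(posteriors: Dict[Sentence, float]) -> List[str]:
    """Global search: return the single sentence with the highest posterior."""
    return list(max(posteriors, key=posteriors.get))


def slot_posteriors(posteriors: Dict[Sentence, float]) -> List[Dict[str, float]]:
    """Sum sentence posteriors per word at each position.

    Assumption: all hypotheses have the same length and are pre-aligned.
    """
    length = len(next(iter(posteriors)))
    slots: List[Dict[str, float]] = [defaultdict(float) for _ in range(length)]
    for sentence, prob in posteriors.items():
        for position, word in enumerate(sentence):
            slots[position][word] += prob
    return [dict(slot) for slot in slots]


def consensus_hypothesis(posteriors: Dict[Sentence, float]) -> List[str]:
    """Local search: pick the highest-posterior word independently in each slot."""
    return [max(slot, key=slot.get) for slot in slot_posteriors(posteriors)]


# Toy posteriors (made up for illustration, not taken from the paper).
toy = {
    ("the", "cat", "sat"): 0.4,
    ("a", "cat", "sit"): 0.3,
    ("a", "dog", "sit"): 0.3,
}

print(map_hypothesis(toy))        # best whole sentence
print(consensus_hypothesis(toy))  # best word per slot
```

In this toy case the MAP sentence has the highest sentence posterior, yet two of its three words are less probable than their competitors in the same slot, so the per-slot choice differs from the MAP choice.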
The Effect of Pruning and Compression on Graphical Representations of the Output of a Speech Recognizer
- Origins and Dtrectioto, CH
, 2003
Cited by 1 (0 self)
Large vocabulary continuous speech recognition can benefit from an efficient data structure for representing a large number of acoustic hypotheses compactly. Word graphs or lattices have been chosen as such an efficient interface between acoustic recognition engines and subsequent language processing modules. This paper first investigates the effect of pruning during acoustic decoding on the quality of word lattices and shows that by combining different pruning options (at the model level and word level), we can obtain word lattices with comparable accuracy to the original lattices and a manageable size. In order to use the word lattices as the input for a post-processing language module, they should preserve the target hypotheses and their scores while being as small as possible. In this paper we introduce a word graph compression algorithm that significantly reduces the number of words in the graphical representation without eliminating utterance hypotheses or distorting their acoustic scores. We compare this word graph compression algorithm with several other lattice size-reducing approaches and demonstrate the relative strength of the new word graph compression algorithm for decreasing the number of words in the representation. Experiments are conducted across corpora and vocabulary sizes to determine the consistency of the pruning and compression results. © 2003 Elsevier Science Ltd. All rights reserved. 1. Introduction. Word lattices are often chosen as the interface between an acoustic recognizer and a subsequent process using a more complex language model (LM) or more specific acoustic model because of ... Computer Speech and Language, www.elsevier.com/locate/csl. Corresponding author. Tel.: +1-765-494-3652; fax: +1-765-494-3371. E-mail addresses: har... yangl@ecn.pur... (M.P. Harper), mike.johnson@mar... (M.T. Johnson), lhj@ecn.pur...

