## Network Optimizations for Large Vocabulary Speech Recognition (1998)

Venue: Speech Communication

Citations: 20 (8 self)

### BibTeX

```bibtex
@ARTICLE{Mohri98networkoptimizations,
  author  = {Mehryar Mohri and Michael Riley},
  title   = {Network Optimizations for Large Vocabulary Speech Recognition},
  journal = {Speech Communication},
  year    = {1998},
  volume  = {25}
}
```

### Abstract

The redundancy and the size of networks in large-vocabulary speech recognition systems can have a critical effect on their overall performance. We describe the use of two new algorithms: weighted determinization and minimization [12]. These algorithms transform recognition labeled networks into equivalent ones that require much less time and space in large-vocabulary speech recognition. They are both optimal: weighted determinization reduces the number of alternatives at each state to the minimum, and weighted minimization reduces the size of deterministic networks to the smallest possible number of states and transitions. These algorithms generalize classical automata determinization and minimization to deal properly with the probabilities of alternative hypotheses and with the relationships between units (distributions, phones, words) at different levels in the recognition system. We illustrate their use in several applications, and report the results of our experiments. Key words...
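The determinization step the abstract describes can be sketched as a weighted subset construction over the tropical (min, +) semiring: subset states carry residual weights, and each outgoing transition takes the minimum combined weight. This is a minimal illustrative sketch, not the authors' implementation; the `arcs` layout and function name are assumptions.

```python
from collections import defaultdict

def determinize(arcs, start):
    """Weighted subset construction over the tropical (min, +) semiring.

    arcs: dict mapping state -> list of (label, weight, next_state) triples.
    Subset states are frozensets of (state, residual_weight) pairs.
    """
    init = frozenset({(start, 0.0)})
    det_arcs = {init: {}}  # subset state -> {label: (weight, next_subset)}
    queue = [init]
    while queue:
        S = queue.pop()
        # Collect, per label, the best residual-plus-arc weight into each
        # reachable destination state.
        by_label = defaultdict(dict)
        for q, v in S:
            for label, w, nq in arcs.get(q, []):
                prev = by_label[label].get(nq)
                if prev is None or v + w < prev:
                    by_label[label][nq] = v + w
        for label, dests in by_label.items():
            w_out = min(dests.values())  # single output weight for this label
            # Destination subset keeps the leftover (residual) weights.
            T = frozenset((nq, wv - w_out) for nq, wv in dests.items())
            det_arcs[S][label] = (w_out, T)
            if T not in det_arcs:
                det_arcs[T] = {}
                queue.append(T)
    return init, det_arcs
```

By construction each subset state has at most one outgoing transition per label, and path weights are preserved because the residuals are pushed forward into the destination subsets.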

### Citations

2438 | The Design and Analysis of Computer Algorithms - Aho, Hopcroft, et al. - 1974

Citation Context: ...e have given an on-the-fly implementation of weighted transducer determinization. 2.3 Minimization of Weighted Acceptors Any deterministic finite automaton can be minimized using classical algorithms [1]. In the same way, any deterministic weighted acceptor A can be minimized using our weighted minimization algorithm [12]. The resulting weighted acceptor B is equivalent to the acceptor A. It has the ...
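The classical minimization this excerpt refers to can be sketched with Moore-style partition refinement: start from the accepting/non-accepting split and split blocks until every state in a block behaves identically. The names and the toy DFA in the test are illustrative assumptions, not from the paper.

```python
def minimize(states, alphabet, delta, finals):
    """Partition refinement (Moore's algorithm) for a complete DFA.

    delta: dict mapping (state, label) -> state; finals: set of
    accepting states. Returns the blocks of equivalent states.
    """
    partition = [b for b in (set(finals), set(states) - set(finals)) if b]
    while True:
        refined = []
        for block in partition:
            groups = {}
            for q in block:
                # Signature: index of the current block each label leads to.
                key = tuple(
                    next(i for i, b in enumerate(partition)
                         if delta[(q, a)] in b)
                    for a in alphabet
                )
                groups.setdefault(key, set()).add(q)
            refined.extend(groups.values())
        if len(refined) == len(partition):  # fixed point: no block split
            return refined
        partition = refined
```

Each block of the final partition becomes one state of the minimal automaton; the weighted variant cited above first pushes weights so that equivalent states also agree on weights.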

334 | Regular models of phonological rule systems - Kaplan, Kay - 1994

Citation Context: ...direct construction might become tedious. One can then use a set of (weighted) rewrite rules to describe the possible contexts. Rewrite rules can be efficiently compiled into finite-state transducers [7,17]. Any context-dependency transducer, whether directly constructed or compiled from patterns or rules, may benefit from transducer determinization. The transducer of figure 10 is not deterministic. Man...

303 | Finite-State Transducers in Language and Speech Processing - Mohri - 1997

Citation Context: ...f networks in large-vocabulary speech recognition systems can have a critical effect on their overall performance. We describe the use of two new algorithms: weighted determinization and minimization [12]. These algorithms transform recognition labeled networks into equivalent ones that require much less time and space in large-vocabulary speech recognition. They are both optimal: weighted determiniza...

114 | Transductions and Context-Free Languages. Teubner Studienbücher - Berstel - 1979

Citation Context: ...gorithms that apply to weighted acceptors and finite-state transducers. Finite-state transducers are automata in which each transition has an output label in addition to the more familiar input label [5,4]. Weighted acceptors or transducers are acceptors or transducers in which each transition has a weight as well as the input, or input and output labels. We cannot give a detailed description of these ...
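The definitions quoted in this excerpt map directly onto a small data structure: a transducer arc carries an output label in addition to the input label, and a weighted arc adds a weight. The field names below are illustrative assumptions, not notation from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WeightedTransducerArc:
    """One transition of a weighted finite-state transducer."""
    src: int            # source state
    input_label: str    # symbol consumed
    output_label: str   # symbol emitted (drop this field for an acceptor)
    weight: float       # e.g. a negative log-probability
    dst: int            # destination state

# Example: an arc mapping a phone to a word with some cost.
arc = WeightedTransducerArc(src=0, input_label="ae",
                            output_label="cat", weight=0.5, dst=1)
```

A weighted acceptor is the special case where input and output labels coincide, which is why the paper's algorithms apply to both.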

97 | Context-dependent phonetic hidden Markov models for speaker-independent continuous speech recognition - Lee - 1990

Citation Context: ...erican Business (NAB) task, and in the DARPA Air Travel Information System (ATIS) task, • Context-dependent phone models: context-dependent phone models are very useful in high-accuracy recognition [8,22]. Weighted determinization and minimization can be used to give a very efficient representation of context-dependent models. This is crucial in real-time large-vocabulary speech recognition systems, f...

80 | Weighted automata in text and speech processing - Mohri, Pereira, et al. - 1996

Citation Context: ...in this way can have a high degree of nondeterminism in large vocabulary systems. This remains true even when using the efficient finite-state composition techniques that lead to more compact results [14,21], rather than simple substitution. The phonemic network can contain states with as many (or more) outgoing arcs as the size of the vocabulary. This large number of alternatives can considerably reduce...

74 | An efficient compiler for weighted rewrite rules - Mohri, Sproat - 1996

Citation Context: ...direct construction might become tedious. One can then use a set of (weighted) rewrite rules to describe the possible contexts. Rewrite rules can be efficiently compiled into finite-state transducers [7,17]. Any context-dependency transducer, whether directly constructed or compiled from patterns or rules, may benefit from transducer determinization. The transducer of figure 10 is not deterministic. Man...

55 | Minimization Algorithms for Sequential Transducers - Mohri

52 | A One Pass Decoder Design for Large Vocabulary Recognition - Odell, Valtchev, et al. - 1994

47 | On Some Applications of Finite-State Automata Theory to Natural Language Processing - Mohri - 1996

Citation Context: ...n of weighted transducers, and the minimization of weighted acceptors. We have given elsewhere a full description of these algorithms, including their mathematical basis and proofs of their soundness [9,10,12,13]. 2.1 Determinization of Weighted Acceptors A weighted acceptor or transducer A is said to be deterministic iff at each state of A there exists at most one transition labeled with any given element ...
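The determinism condition quoted in this excerpt (at most one outgoing transition per state with a given label) is easy to check mechanically. A minimal sketch, assuming the same illustrative `arcs` layout of `(label, weight, next_state)` triples per state:

```python
def is_deterministic(arcs):
    """arcs: dict mapping state -> list of (label, weight, next_state).

    Returns True iff no state has two outgoing transitions that share
    an input label.
    """
    for state, transitions in arcs.items():
        labels = [label for label, _weight, _dst in transitions]
        if len(labels) != len(set(labels)):  # duplicate label at this state
            return False
    return True
```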

41 | Full expansion of context dependent networks in large vocabulary speech recognition - Mohri, Riley, et al. - 1998

Citation Context: ...minization and weighted minimization. We have described elsewhere in great detail a real-time 160,000 word continuous speech recognition system based on the use of weighted transducer determinization [15]. 4 Conclusion The use of weighted determinization and minimization in large vocabulary speech recognition leads to very substantial improvements in performance. In addition, it shows the true degr...

33 | Language-model look-ahead for large vocabulary speech recognition - Ortmanns, Ney, et al. - 1996

23 | Minimization of Sequential Transducers - Mohri - 1994

Citation Context: ...n of weighted transducers, and the minimization of weighted acceptors. We have given elsewhere a full description of these algorithms, including their mathematical basis and proofs of their soundness [9,10,12,13]. 2.1 Determinization of Weighted Acceptors A weighted acceptor or transducer A is said to be deterministic iff at each state of A there exists at most one transition labeled with any given element ...

22 | A Comparison of Time Conditioned and Word Conditioned Search Techniques for Large Vocabulary Speech Recognition - Ortmanns, Ney, Lindam, et al. - 1996

18 | Language Model Representations for Beam-Search Decoding - Antoniol, Brugnara, et al. - 1995

Citation Context: ...phabet considered (words, phonemes, etc.). Other related work has been done to reduce that redundancy using deterministic trees, in particular lexical trees, or the so-called tree-based representations [3,6,18-20]. The weighted determinization algorithm that we present is a very general algorithm that differs from those approaches by the following: it does not require that networks be constructed as trees, it ...

14 | A tree search strategy for large-vocabulary continuous speech recognition - Gopalakrishnan, Bahl, et al. - 1995

Citation Context: ...phabet considered (words, phonemes, etc.). Other related work has been done to reduce that redundancy using deterministic trees, in particular lexical trees, or the so-called tree-based representations [3,6,18-20]. The weighted determinization algorithm that we present is a very general algorithm that differs from those approaches by the following: it does not require that networks be constructed as trees, it ...

12 | Automata, Languages and Machines, volume A–B - Eilenberg - 1974

Citation Context: ...gorithms that apply to weighted acceptors and finite-state transducers. Finite-state transducers are automata in which each transition has an output label in addition to the more familiar input label [5,4]. Weighted acceptors or transducers are acceptors or transducers in which each transition has a weight as well as the input, or input and output labels. We cannot give a detailed description of these ...

12 | Transducer composition for context-dependent network expansion - Riley, Pereira, et al. - 1997

Citation Context: ...rd lattice W2, including I/O (which take most of this time) on the same computer. 3.2 Context-dependent Phone Models Context-dependent phone models can be represented by finite-state transducers [21]. They can be built directly in many cases. The transducer of figure 10 for instance can be constructed directly to represent the inverse of a context-dependency model: it maps phone sequences to seque...

5 | Tree-based state-tying for high accuracy acoustic modelling - Young, Odell, et al. - 1994

Citation Context: ...erican Business (NAB) task, and in the DARPA Air Travel Information System (ATIS) task, • Context-dependent phone models: context-dependent phone models are very useful in high-accuracy recognition [8,22]. Weighted determinization and minimization can be used to give a very efficient representation of context-dependent models. This is crucial in real-time large-vocabulary speech recognition systems, f...

1 | Finite-State Language Processing, chapter On The Use of Sequential Transducers - Mohri - 1997

Citation Context: ...lization of the classical determinization of automata [2]. Unlike the classical case though, not all weighted automata can be determinized. (The appropriate term used in theoretical computer science is subsequential [11].) [Fig. 1. Non-deterministic weighted automaton A1.] However, most weighted acceptors used in speech processi...