Hierarchical search for large vocabulary conversational speech recognition (1999)
| Venue: | IEEE Signal Processing Magazine |
| Citations: | 15 - 5 self |
BibTeX
@ARTICLE{Deshmukh99hierarchicalsearch,
author = {Neeraj Deshmukh and Aravind Ganapathiraju and Joseph Picone},
title = {Hierarchical search for large vocabulary conversational speech recognition},
journal = {IEEE Signal Processing Magazine},
year = {1999},
volume = {16},
pages = {84--107}
}
Abstract
Speaker-independent speech recognition technology has made significant progress from the days of isolated word recognition. Today, state-of-the-art systems are capable of performing large vocabulary continuous speech recognition (LVCSR) on audio streams derived from complex information sources such as broadcast news and two-way telephone dialogs. A significant contribution to this advancement in technology is the development of search techniques that find suboptimal but accurate solutions in problems involving large search spaces and extremely complex statistical models. Moreover, these search strategies are capable of dynamically integrating information from a number of diverse knowledge sources to determine the correct word hypothesis, and limit the scope of the search by using a hierarchical search strategy. We refer to this problem as the decoding or search problem. This paper describes the complexity associated with decoding using hierarchical representations for linguistic and acoustic knowledge sources. An extensible object-oriented decoder, available in the public domain, that leverages current state-of-the-art technology is described to illustrate these concepts. This decoder supports efficient handling of acoustic models for cross-word context-dependent phones, multiple pronunciations of words using lexical trees, and rescoring of word graphs based on N-gram language models in a single pass. It employs a state-of-the-art Viterbi-style dynamic programming algorithm, and is equipped with several heuristic pruning criteria to minimize the consumption of computational resources while maintaining good accuracy.







