The Use of Context in Large Vocabulary Speech Recognition
1995
Cited by 93 (0 self)
... decide which contexts are similar and can share parameters. A key feature of this approach is that it allows the construction of models which are dependent upon contextual effects occurring across word boundaries. The use of cross-word context-dependent models presents problems for conventional decoders. The second part of the thesis therefore presents a new decoder design which is capable of using these models efficiently. The decoder is suitable for use with very large vocabularies and long-span language models. It is also capable of generating a lattice of word hypotheses with little computational overhead. These lattices can be used to constrain further decoding, allowing efficient use of complex acoustic and language models. The effectiveness of these techniques has been assessed on a variety of large vocabulary continuous speech recognition tasks and results are presented which analyse performance in terms of computational complexity and recognition accuracy. The experiments dem...
A New Approach To Generalized Mixture Tying For Continuous HMM-Based Speech Recognition
Proc. EUROSPEECH, Rhodes, 1997
Cited by 4 (3 self)
In this paper we present a new approach for a generalized tying of mixture components for continuous mixture-density HMM-based speech recognition systems. With an iterative pruning and splitting procedure for the mixture components, this approach offers a very accurate and detailed representation of the acoustic space and at the same time keeps the number of parameters reasonably small in favor of robust parameter estimation and fast decoding. Contrary to other approaches, it does not require a strict clustering of the pdfs into subsets that share their mixture components, so it is capable of providing more general and flexible types of mixture tying. We applied the new approach to a semi-continuous HMM (SCHMM) system for the Resource Management task, improving its recognition performance by 12% and vastly accelerating the decoding because of a much faster likelihood computation.
1. INTRODUCTION In continuous mixture-density HMM-based speech recognition systems the HMM stat...
Segmentation And Classification Of Hand-Drawn Pictograms In Cluttered Scenes - An Integrated Approach
1999
Cited by 3 (2 self)
In this paper, a new approach to identification of handwritten symbols in arbitrarily complex environments is presented. 20 different pictograms drawn in different backgrounds can be identified with a recognition accuracy of 90%. In order to perform this challenging task, we use pattern spotting techniques based on pseudo 2-D Hidden Markov Models (P2DHMMs). Practical applications of our approach can be found in many typical multimedia document processing tasks, such as localization and recognition of non-rigid objects in image databases, detection of objects in complex scenes, finding trademarks in the presence of clutter within videos, processing distorted document images in digital libraries, or content-based image retrieval based on handwritten query symbols.
1. INTRODUCTION Segmentation is the first essential and important step of low-level vision and can be described as a process of partitioning an image into some non-intersecting regions such that each region is homogeneous and the u...
Refining Tree-Based State Clustering by Means of Formal Concept Analysis, Balanced Decision Trees and Automatically Generated Model-Sets
In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP99, 1999
Cited by 2 (0 self)
Decision tree-based state clustering has emerged in recent years as the most popular approach for clustering the states of context-dependent hidden Markov model based speech recognizers. The application of sets of phones, mainly phonetically motivated, that limit the possible clusters results in reasonably good modeling of unseen phones, while still allowing specific phones to be modeled very precisely whenever this is necessary and enough training data is available. Formal Concept Analysis, a young mathematical discipline, provides means for the treatment of sets and sets of sets that are well suited for further improving tree-based state clustering. The possible refinements are outlined and evaluated in this paper. The major merit is the proposal of procedures for adapting the number of sets used for clustering to the amount of available training data, and of a method that generates suitable sets automatically without the incorporation of additional knowledge.
1. INTRODUCTIO...
Parameter Tying For Flexible Speech Recognition
1996
This paper presents two parameter tying techniques which enable a trade-off between the computational cost and the recognition performance of a speaker-independent flexible speech recognition system working over the telephone network. Parameter tying is conducted at phonetic and acoustic levels.

