Results 1 - 10 of 12
Grapheme Based Speech Recognition
- in Proceedings of EUROSPEECH, 2003
- Cited by 15 (4 self)
Large vocabulary speech recognition systems traditionally represent words in terms of subword units, usually phonemes. This paper investigates the potential of graphemes acting as subunits. To develop context-dependent grapheme-based speech recognizers, several decision-tree-based clustering procedures are performed and compared with each other. Grapheme-based speech recognizers in three languages (English, German, and Spanish) are trained and compared to their phoneme-based counterparts. The results show that for languages with a close grapheme-to-phoneme relation, grapheme-based modeling is as good as phoneme-based modeling. Furthermore, multilingual grapheme-based recognizers are designed to investigate whether grapheme-based information can be successfully shared among languages. Finally, some bootstrapping experiments for Swedish were performed to test the potential for rapid language deployment.
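As a rough illustration of what "graphemes acting as subunits" can mean in practice, the sketch below builds a pronunciation lexicon directly from spelling and expands each word into context-dependent grapheme units, by analogy with triphones. The function names, the `#` boundary symbol, and the `left-center+right` unit notation are assumptions made for this example; the abstract does not specify the authors' unit inventory or notation.

```python
def grapheme_lexicon(words):
    """Map each word to its sequence of grapheme units (lower-cased letters).

    Hypothetical helper: a real system may treat digits, diacritics,
    or multi-letter graphemes differently.
    """
    return {w: [ch for ch in w.lower() if ch.isalpha()] for w in words}


def context_dependent_units(graphemes, boundary="#"):
    """Expand a grapheme sequence into left-center+right units.

    The boundary symbol pads the word so that the first and last
    graphemes also receive a left and right context.
    """
    padded = [boundary] + list(graphemes) + [boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


lexicon = grapheme_lexicon(["Speech", "Haus"])
units = context_dependent_units(lexicon["Haus"])
print(units)
```

Units of this kind are what a decision-tree clustering procedure would then group into tied acoustic models; no phoneme dictionary is needed to produce them.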
Context-Dependent Acoustic Modeling Using Graphemes For Large Vocabulary Speech Recognition
- in Proceedings of the ICASSP, 2002
- Cited by 12 (2 self)
In this paper we propose to use a decision tree based on graphemic acoustic sub-word units together with phonetic questions. We also show that automatic question generation can be used to completely eliminate any manual effort.
Multilingual Acoustic Modeling Using Graphemes
- in Proceedings of the European Conference on Speech Communication and Technology, 2003
- Cited by 5 (0 self)
In this paper we combine grapheme-based sub-word units with multilingual acoustic modeling. We show that a global decision tree together with automatically generated grapheme questions eliminates manual effort completely. We also investigate the effects of additional language questions. We present ...
Acoustic phonetic modelling using local codebook features
- in Proc. ICSLP’04, Jeju, Korea, 2004
- Cited by 4 (1 self)
In this article we present an alternative method for defining the question set used for the induction of acoustic phonetic decision trees. The method is data driven and employs local similarities between the probability density functions of hidden Markov models. The method is shown to work at least as well as the standard method using question sets devised by human experts.
Local codebook features for mono- and multilingual acoustic phonetic modelling
- in Proc. AST’04
- Cited by 1 (0 self)
In this article we present an alternative method for defining the question set used for the induction of acoustic phonetic decision trees. The method is data driven and employs local similarities between the probability density functions of hidden Markov models. We apply the method to mono- and multilingual acoustic phonetic modelling, showing that it yields results comparable to those of the standard method, which uses question sets devised by human experts.
Letter-based speech synthesis
- Cited by 1 (1 self)
Initial attempts at performing text-to-speech conversion based on standard orthographic units are presented, forming part of a larger scheme of training TTS systems on features that can be trivially extracted from text. We evaluate the possibility of using the technique of decision-tree-based context clustering conventionally used in HMM-based systems for parameter tying to handle letter-to-sound conversion. We present the application of a method of compound-feature discovery to corpus-based speech synthesis. Finally, an evaluation of intelligibility of letter-based systems and more conventional phoneme-based systems is presented. Index Terms: Statistical parametric speech synthesis, HMM-based speech synthesis, letter-to-sound conversion, graphemes.
Adapting Phonetic Decision Trees Between Languages For Continuous Speech Recognition
- 2000
- Cited by 1 (0 self)
In a continuous speech recognition system it is important to model the context-dependent variations in the pronunciations of phones. In this work we have attempted to build decision trees for modeling phonetic context dependency in Hindi. The approach followed is to modify a decision tree built to model context dependency in American English. The decision trees turn out to be different for two reasons: the English and Hindi phoneme sets are not identical, and even for identical phonemes the context dependency differs between the two languages. Linguistic-phonetic knowledge of Hindi is used to modify the English phone set. Since the Hindi phone set being used is derived from the English phone set, the adaptation of the English tree to Hindi follows naturally. Although the adaptation here is from English to Hindi, the method may be applicable for adapting between any two languages. The decision tree is built using either Hindi data or English data labeled with the correct Hindi contexts. This procedure is discussed and the limitations of both methods are described.
Decision Tree-Based State Tying For Acoustic Modeling
- 1999
...
2. INTRODUCTION
3. ACOUSTIC MODELS
3.1. Hidden Markov Models
3.2. Parameter Tying
4. DECISION TREE
4.1. Theory
4.2. Likelihood Computation
4.3. Implementation
...
Accessing Language Specific Linguistic Information for Triphone Model
- in Proc. the 2nd Language & Technology Conference: Human Language Technologies, 2005
This paper is concerned with a novel methodology for generating phonetic questions used in tree-based state tying for speech recognition. In order to implement a speech recognition system, language-dependent knowledge which goes beyond annotated material is usually required. The approach presented here generates phonetic questions for decision trees based on a feature table that summarizes the articulatory characteristics of each sound. On the one hand, this method allows better language-specific triphone models to be defined given only a feature table as linguistic input. On the other hand, the feature-table approach facilitates efficient definition of triphone models for other languages, since again only a feature table for the new language is required. The approach is exemplified with speech recognition systems for English and Thai.
Feature-Table-Based Automatic Question Generation
This paper presents a system for automatically generating linguistic questions based on a feature table. Such questions are an essential input for tree-based state tying, a technique which is widely used in speech recognition. In general, in order to utilize this technique, linguistic (or more accurately phonetic) questions have to be carefully defined. This may be extremely time consuming and require a considerable amount of resources. The system proposed in this paper provides a more elegant and efficient way to generate a set of questions from a simple feature table of the type employed in phonetic studies.
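The idea of deriving questions from a feature table can be sketched as follows. This is a minimal illustration under stated assumptions, not the system described in the paper: the toy `FEATURE_TABLE`, its feature names and values, the `generate_questions` function, and the `L_`/`R_` naming of left- and right-context questions are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy feature table; the entries are illustrative only
# and are not taken from the paper.
FEATURE_TABLE = {
    "p": {"voiced": "no", "manner": "stop", "place": "bilabial"},
    "b": {"voiced": "yes", "manner": "stop", "place": "bilabial"},
    "m": {"voiced": "yes", "manner": "nasal", "place": "bilabial"},
    "s": {"voiced": "no", "manner": "fricative", "place": "alveolar"},
    "z": {"voiced": "yes", "manner": "fricative", "place": "alveolar"},
}


def generate_questions(table):
    """Turn each feature=value pair into a left- and right-context question.

    A question is represented as a name mapped to the sorted list of
    phones that answer "yes" to it.
    """
    classes = defaultdict(set)
    for phone, features in table.items():
        for name, value in features.items():
            classes[f"{name}={value}"].add(phone)

    questions = {}
    for label, phones in sorted(classes.items()):
        # A class containing every phone cannot split any node, so skip it.
        if len(phones) == len(table):
            continue
        for side in ("L", "R"):
            questions[f"{side}_{label}"] = sorted(phones)
    return questions


questions = generate_questions(FEATURE_TABLE)
for name, phones in questions.items():
    print(name, phones)
```

With a table like this, adding a language means supplying a new table rather than hand-writing a new question set, which is the kind of saving the abstract points to.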

