Joint processing and discriminative training for letter-to-phoneme conversion
- In Proc. ACL, 2008
Cited by 10 (6 self)
We present a discriminative structure-prediction model for the letter-to-phoneme task, a crucial step in text-to-speech processing. Our method encompasses three tasks that have been previously handled separately: input segmentation, phoneme prediction, and sequence modeling. The key idea is online discriminative training, which updates parameters according to a comparison of the current system output to the desired output, allowing us to train all of our components together. By folding the three steps of a pipeline approach into a unified dynamic programming framework, we are able to achieve substantial performance gains. Our results surpass the current state-of-the-art on six publicly available data sets representing four different languages.
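The online update described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes exactly one phoneme per letter (so the input segmentation step is left out), uses a hypothetical feature set of letter-phoneme and phoneme-bigram counts, and swaps in a plain perceptron-style update as the simplest instance of "compare the current output to the desired output". All function names are illustrative.

```python
from collections import Counter

def features(letters, phonemes):
    """Hypothetical feature map: letter-phoneme pairs and phoneme bigrams."""
    feats = Counter()
    prev = "<s>"
    for letter, phoneme in zip(letters, phonemes):
        feats[("emit", letter, phoneme)] += 1
        feats[("trans", prev, phoneme)] += 1
        prev = phoneme
    return feats

def decode(weights, letters, phoneme_set):
    """Dynamic programming (Viterbi) search, assuming one phoneme per letter."""
    # best[p] = (score, path) of the best partial output ending in phoneme p
    best = {"<s>": (0.0, [])}
    for letter in letters:
        nxt = {}
        for p in phoneme_set:
            emit = weights.get(("emit", letter, p), 0.0)
            s, path = max(
                ((ps + weights.get(("trans", prev, p), 0.0), ppath)
                 for prev, (ps, ppath) in best.items()),
                key=lambda t: t[0],
            )
            nxt[p] = (s + emit, path + [p])
        best = nxt
    return max(best.values(), key=lambda t: t[0])[1]

def train(data, phoneme_set, epochs=5):
    """Online training: update only when the system output differs from gold."""
    weights = {}
    for _ in range(epochs):
        for letters, gold in data:
            guess = decode(weights, letters, phoneme_set)
            if guess != list(gold):
                # Reward features of the desired output, penalize the guess.
                for f, v in features(letters, gold).items():
                    weights[f] = weights.get(f, 0.0) + v
                for f, v in features(letters, guess).items():
                    weights[f] = weights.get(f, 0.0) - v
    return weights
```

Because the emission and transition weights are updated by the same comparison, prediction and sequence modeling are trained together rather than in separate pipeline stages, which is the point the abstract makes.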
Learning Pronunciation Dictionaries: Language Complexity and Word Selection Strategies
- Proceedings of the Human Language Technology Conference of the NAACL, 2006
Cited by 9 (2 self)
The speed with which pronunciation dictionaries can be bootstrapped depends on the efficiency of learning algorithms and on the ordering of words presented to the user. This paper presents an active-learning word selection strategy that is mindful of human limitations. Learning rates approach that of an oracle system that knows the final LTS rule set.
SPICE: Web-based Tools for Rapid Language Adaptation in Speech Processing Systems
- In the Proceedings of INTERSPEECH, 2007
Cited by 7 (5 self)
In this paper we describe the design and implementation of a user interface for SPICE, a web-based toolkit for rapid prototyping of speech and language processing components. We report on the challenges and experiences gathered from testing these tools in an advanced graduate hands-on course, in which we created speech recognition, speech synthesis, and small-domain translation components for 10 different languages within only 6 weeks.
Challenges with Rapid Adaptation of Speech Translation Systems to New Language Pairs
- In the Proceedings of ICASSP, 2006
Cited by 4 (2 self)
Although we have far from solved the issues in porting speech translation systems to new languages, we have gathered sufficient experience by now to identify a number of major challenges in the process. Although well-defined processes exist for building speech recognition, speech synthesis and statistical machine translation models, they still require both significant native speaker involvement and linguistic expertise. As the core technology improves we believe we will see increasing cultural and social issues in contributions from native speakers. This paper identifies some of these issues and presents our initial attempts to build tools that we hope will eventually allow linguistically naive native informants to build complete speech translation systems.

