The Impact Of Speech Recognition On Speech Synthesis
2002
Cited by 4 (0 self)
Speech synthesis has changed dramatically in the past few years to have a corpus-based focus, borrowing heavily from advances in automatic speech recognition. In this paper, we survey technology in speech recognition systems and how it translates (or doesn't translate) to speech synthesis systems. We further speculate on future areas where ASR may impact synthesis and vice versa.
Flexible Speech Synthesis Using Weighted Finite State Transducers
1996
Cited by 3 (1 self)
The main focus of this thesis is on improving the quality of concatenative speech synthesis by taking advantage of the natural (allowable) variability in spoken language, namely, the fact that there are multiple ways of uttering a given sentence and there are several word sequences that can represent a given concept. An architecture for speech generation for constrained domain applications is proposed that tightly integrates language generation and speech synthesis, allowing the choice of words and desired intonation in the system's response to be optimized jointly with the speech output quality. Experiments with a travel planning dialog system have demonstrated that by expanding the space of candidate responses and possible prosodic realizations we achieve higher quality speech output.
Bayesian Induction of intonational phrase breaks
Cited by 2 (1 self)
This paper establishes a Bayesian probabilistic framework for the automatic acquisition of intonational phrase breaks. Two approaches that differ in their conditional independence assumptions, naïve Bayes and Bayesian networks, were evaluated against the CART algorithm, which has previously been used with success. The features come from a finite-length window of minimal morphological and syntactic resources: the POS label and the kind of phrase boundary, a novel syntactic feature that has not been applied to intonational phrase break detection before. This feature can be used in languages where syntactic parsers are not available, and it proves to be important not only for the proposed Bayesian methodologies but also for other algorithms, such as CART. Trained on a 5500-word database, Bayesian networks proved to be the most effective in terms of precision (82.3%) and recall (77.2%) for predicting phrase breaks.
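As a rough illustration of the naïve Bayes variant described in this abstract, the sketch below classifies each word juncture as break or no break from POS tags in a small window. It is a minimal sketch under stated assumptions, not the authors' implementation: the tag set, the window size, the add-one smoothing, the toy data, and the names `window_features` and `NaiveBayes` are all hypothetical, and the paper's phrase-boundary feature is omitted.

```python
import math
from collections import Counter, defaultdict


def window_features(pos_tags, i, size=1):
    """POS tags around the juncture that follows token i (hypothetical feature set)."""
    feats = {}
    for off in range(-size + 1, size + 1):
        j = i + off
        feats[f"pos[{off}]"] = pos_tags[j] if 0 <= j < len(pos_tags) else "<pad>"
    return feats


class NaiveBayes:
    """Categorical naive Bayes with additive smoothing (alpha is an assumption)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.label_counts = Counter()
        self.feat_counts = defaultdict(Counter)  # (label, feature name) -> value counts
        self.values = defaultdict(set)           # feature name -> values seen in training

    def fit(self, X, y):
        for feats, label in zip(X, y):
            self.label_counts[label] += 1
            for name, value in feats.items():
                self.feat_counts[(label, name)][value] += 1
                self.values[name].add(value)
        return self

    def predict(self, feats):
        total = sum(self.label_counts.values())
        best, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            # log prior plus the sum of per-feature log likelihoods
            score = math.log(n / total)
            for name, value in feats.items():
                count = self.feat_counts[(label, name)][value]
                vocab = len(self.values[name]) + 1  # one extra slot for unseen values
                score += math.log((count + self.alpha) / (n + self.alpha * vocab))
            if score > best_score:
                best, best_score = label, score
        return best


# Toy, hand-made data (not from the paper): one break label per juncture.
pos = ["DT", "NN", "VBD", "PUNC", "DT", "NN", "VBD", "PUNC"]
breaks = ["no", "no", "no", "yes", "no", "no", "no", "yes"]

X = [window_features(pos, i) for i in range(len(pos))]
model = NaiveBayes().fit(X, breaks)

new_sentence = ["DT", "NN", "PUNC"]
print([model.predict(window_features(new_sentence, i)) for i in range(len(new_sentence))])
```

Naïve Bayes treats the window positions as independent given the label, while a Bayesian network can encode dependencies between them. That difference is the conditional independence assumption the abstract refers to.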
A Data-Driven Framework for Intonational Phrase Break Prediction
Cited by 1 (0 self)
In the present work, we study the automatic acquisition of intonational phrase breaks. We propose a mathematically well-formed framework based on Bayesian theory. Starting from two different assumptions about the conditional independence of the input attributes, we arrive at two Bayesian implementations, namely the Naïve Bayes and the Bayesian Networks classifiers. As a performance benchmark, we evaluated the experimental results against CART, an acclaimed algorithm in the field of intonational phrase break detection that has demonstrated state-of-the-art figures. Our approach uses minimal morphological and syntactic resources in a finite-length window: the POS label and the type of syntactic phrase boundary, a novel attribute that has not been applied to this task before. On a 5500-word training set, the Bayesian networks approach proved to be the most effective, yielding precision and recall of about 82% and 77% respectively.
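The precision and recall figures quoted above are the standard metrics for the break class. The sketch below shows how they are computed from gold and predicted labels. The label names, the function name `precision_recall`, and the toy label sequences are assumptions for illustration and are not data from the paper.

```python
def precision_recall(gold, predicted, positive="yes"):
    """Precision and recall for the positive (break) class."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)  # breaks correctly predicted
    fp = sum(g != positive and p == positive for g, p in pairs)  # spurious breaks
    fn = sum(g == positive and p != positive for g, p in pairs)  # missed breaks
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels (hypothetical): one entry per word juncture.
gold = ["yes", "no", "yes", "yes", "no"]
pred = ["yes", "yes", "no", "yes", "no"]
p, r = precision_recall(gold, pred)
print(f"precision={p:.3f} recall={r:.3f}")
```

Because most junctures carry no break, plain accuracy would be dominated by the majority class. Reporting precision and recall for the break class avoids that.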
A New Prosodic Phrasing Model for Chinese TTS Systems
In Proc. 6th Natural Language Processing Pacific Rim Symposium, 2001
This paper proposes a new prosodic phrasing model for Chinese text-to-speech systems. First, in contrast to the commonly used CART techniques, we propose a new inductive learning algorithm based on the extension matrix theory. Second, we collected 559 sentences (approximately 78 minutes of speech) from news programs and built a corresponding speech corpus uttered by a professional male announcer. The prosodic boundaries were manually marked, and word identification, POS tagging and syntactic analysis were also performed on the text. Finally, our model was trained on 371 sentences and tested on 188.
Experimental Evaluation of Tree-Based Algorithms for Intonational Breaks Representation
2005
The prosodic specification of an utterance to be spoken by a Text-to-Speech synthesis system can be devised in break indices, pitch accents and boundary tones. In particular, the identification of break indices formulates the intonational phrase breaks that affect all the forthcoming prosody-related procedures. In the

