Results 1 - 3 of 3
Learning to speak. Sensori-motor control of speech movements
, 1998
Abstract
-
Cited by 14 (3 self)
This paper shows how an articulatory model, able to produce acoustic signals from articulatory motion, can learn to speak, i.e. coordinate its movements in such a way that it utters meaningful sequences of sounds belonging to a given language. This complex learning procedure is accomplished in four major steps: (a) a babbling phase, where the device builds up a model of the forward transforms, i.e. the articulatory-to-audio-visual mapping; (b) an imitation stage, where it tries to reproduce a limited set of sound sequences by audio-visual-to-articulatory inversion; (c) a "shaping" stage, where phonemes are associated with the most efficient available sensori-motor representation; and finally, (d) a "rhythmic" phase, where it learns the appropriate coordination of the activations of these sensori-motor targets.
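Steps (a) and (b) can be illustrated with a minimal sketch, under strong simplifying assumptions. The function `toy_forward`, its two articulatory parameters, and its coefficients are hypothetical stand-ins, not the paper's articulatory model. Random babbling fills a memory of (articulation, sound) pairs, and imitation is reduced here to a nearest-neighbour lookup in that memory, which is only one possible way to perform the inversion.

```python
import random


def toy_forward(artic):
    """Hypothetical articulatory-to-acoustic map (illustration only)."""
    jaw, tongue = artic
    f1 = 300.0 + 500.0 * jaw                      # jaw opening raises "F1"
    f2 = 800.0 + 1400.0 * tongue - 200.0 * jaw    # tongue fronting raises "F2"
    return (f1, f2)


def babble(n_samples, seed=0):
    """Babbling phase: try random articulations and remember what they sound like."""
    rng = random.Random(seed)
    memory = []
    for _ in range(n_samples):
        artic = (rng.random(), rng.random())
        memory.append((artic, toy_forward(artic)))
    return memory


def imitate(memory, target):
    """Imitation stage: return the remembered articulation whose sound is closest to the target."""
    def squared_error(pair):
        _, (f1, f2) = pair
        return (f1 - target[0]) ** 2 + (f2 - target[1]) ** 2

    artic, _ = min(memory, key=squared_error)
    return artic


memory = babble(2000)
target = (550.0, 1500.0)          # assumed target sound, in Hz
artic = imitate(memory, target)
produced = toy_forward(artic)     # what the model utters when it imitates the target
```

The lookup table stands in for the learned forward model; a denser babbling phase gives a closer imitation, at the cost of a larger memory.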
Articulatory Synthesis From X-Rays And Inversion For An Adaptive Speech Robot
- In Proceedings of ICSLP'96: The Fourth International Conference on Spoken Language Processing
, 1996
Abstract
-
Cited by 2 (0 self)
This paper describes a speech robotic approach to articulatory synthesis. An anthropomorphic speech robot has been built, based on a real reference subject's data. This speech robot, called the Articulotron, has a set of relevant degrees of freedom for the speech articulators: jaw, tongue, lips, and larynx. The associated articulatory model has been elaborated from cineradiographic midsagittal profiles recorded in synchrony with front views of the lips; the model of the noise source for fricative excitation has been derived from acoustic and aerodynamic measurements on the same reference subject. In a first phase, the Articulotron has been used to perform the copy synthesis of the vowels and of the fricative and plosive consonants in the X-ray corpus. This makes it possible to assess the performance of the Articulotron in producing fairly high-quality speech, and provides a reference against which other attempts at articulatory synthesis can be compared. In a second phase, the Articulotron has been used to recover articulatory gestures from audio-visual speech prototypes. At the present stage, a gradient descent algorithm is used to learn the articulatory trajectories of the robot by optimisation, starting from the formant trajectories and the knowledge of constraints for the consonantal constriction or closure, in order to mimic the original VCV audio-visual sequences. The adaptive skill of the robot is demonstrated through articulator perturbation experiments and through the elaboration of relevant strategies in the hyper/hypo speech paradigm. A video tape will demonstrate an animation of the Articulotron, displaying the jaw, the tongue and the lips, for various examples of adaptive articulatory synthesis.
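The gradient descent inversion mentioned in this abstract can be sketched as follows, under assumptions that are not taken from the paper. The map `toy_formants`, its coefficients, the step size and the three-frame target sequence are all hypothetical, gradients are estimated by finite differences, and the constriction and closure constraints are left out entirely.

```python
def toy_formants(artic):
    """Hypothetical map from (jaw, tongue) to (F1, F2) in kHz; not the Articulotron model."""
    jaw, tongue = artic
    f1 = 0.3 + 0.5 * jaw
    f2 = 0.8 + 1.4 * tongue - 0.2 * jaw * tongue
    return (f1, f2)


def loss(artic, target):
    """Squared formant error between the produced and the target frame."""
    f1, f2 = toy_formants(artic)
    return (f1 - target[0]) ** 2 + (f2 - target[1]) ** 2


def numeric_grad(artic, target, h=1e-5):
    """Central-difference estimate of the gradient of the loss."""
    grad = []
    for i in range(len(artic)):
        up = list(artic)
        down = list(artic)
        up[i] += h
        down[i] -= h
        grad.append((loss(up, target) - loss(down, target)) / (2 * h))
    return grad


def invert_frame(start, target, rate=0.2, steps=500):
    """Plain gradient descent on the articulatory parameters for one frame."""
    artic = list(start)
    for _ in range(steps):
        grad = numeric_grad(artic, target)
        artic = [a - rate * g for a, g in zip(artic, grad)]
    return artic


def invert_trajectory(targets, start=(0.5, 0.5)):
    """Invert each frame in turn, warm-starting from the previous frame's solution."""
    trajectory = []
    current = list(start)
    for target in targets:
        current = invert_frame(current, target)
        trajectory.append(tuple(current))
    return trajectory


# Assumed formant targets (kHz) for a three-frame, VCV-like sequence.
formant_targets = [(0.7, 1.2), (0.4, 1.8), (0.7, 1.2)]
gestures = invert_trajectory(formant_targets)
```

Warm-starting each frame from the previous solution keeps successive articulations close to one another, a crude substitute for an explicit smoothness term on the trajectory.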
Articulatory Synthesis Of Fricative Consonants: Data And Models
- In ETRW on Speech Production: from Control Strategies to acoustics
, 1996
Abstract
-
Cited by 2 (2 self)
The present work aims at demonstrating the feasibility of high-quality articulatory synthesis for fricative consonants, and in particular at matching a given reference subject. The synthesiser includes an articulatory model based on cineradiographic pictures of the subject, and a simplified aerodynamic model. Two approaches have been used: direct articulatory copy synthesis, and copy synthesis by acoustic-to-articulatory inversion. Coordination between supralaryngeal and laryngeal articulators has been quasi-automatically determined, based on supplementary aerodynamic data. A set of VFV spatiotemporal exemplars has finally been built, and should serve to establish sensory-motor templates for synthesis.
Introduction
We believe that articulatory synthesis is a promising approach to speech synthesis, because its anthropomorphic nature makes it possible to adapt, in a coherent fashion, the synthesis strategies to the environmental conditions. The present work thus aimed at demonstrating the feasibility...

