The Emergence of Phonology from the Interplay of Speech Comprehension and Production: A Distributed Connectionist Approach
- In B. MacWhinney, 1998
Learning to speak. Sensori-motor control of speech movements
1998
Cited by 14 (3 self)
This paper shows how an articulatory model, able to produce acoustic signals from articulatory motion, can learn to speak, i.e. coordinate its movements in such a way that it utters meaningful sequences of sounds belonging to a given language. This complex learning procedure is accomplished in four major steps: (a) a babbling phase, where the device builds up a model of the forward transforms, i.e. the articulatory-to-audio-visual mapping; (b) an imitation stage, where it tries to reproduce a limited set of sound sequences by audio-visual-to-articulatory inversion; (c) a "shaping" stage, where phonemes are associated with the most efficient available sensori-motor representation; and finally, (d) a "rhythmic" phase, where it learns the appropriate coordination of the activations of these sensori-motor targets.
Learning To Speak: Speech Production And Sensori-Motor Representations
1997
Cited by 5 (1 self)
This chapter describes how an artificial device, able to produce acoustic signals from articulatory motion, can learn to speak, i.e. coordinate its articulatory movements in such a way that it utters meaningful sequences of sounds belonging to a given language. This complex learning procedure, accomplished within a few years by the human child, is simulated in four major steps: (a) a babbling phase, where the device builds up a model of the forward kinematics, i.e. the articulatory-to-audio-visual mapping; (b) an imitation stage, where it tries to reproduce a limited set of sound sequences by audio-visual-to-articulatory inversion including a normalization procedure; (c) a "shaping" stage, where phonemes are associated with sensori-motor representation; and finally, (d) a "rhythmic" phase, where it learns the appropriate coordination of the activations of these sensori-motor targets. This artificial device has thus an ear which delivers both the control signals and the identification of pe...
Data-driven articulatory inversion incorporating articulator priors
Recovering the motions of speech articulators from the acoustic speech signal has a long history, starting from the observation that a simple concatenated tube model is a reasonable model for the origin of formant resonances. In this work, we take a different approach making minimal assumptions about the interdependence of acoustics and articulators by estimating the full joint distribution of the two spaces based on a corpus of paired data, derived from an articulatory synthesizer. This approach allows us to estimate posterior distributions of articulator state as well as finding the maximum-likelihood trajectories. We present examples comparing this approach to a related, earlier approach that did not incorporate prior distributions over articulator space, and demonstrate the advantages of learning the models from realistic utterances. We also indicate benefits available from jointly estimating particular pairs of articulators that have high mutual dependence. Index Terms: articulatory inversion, speech acoustics
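The idea in the abstract above, estimating a joint distribution over articulator and acoustic spaces from paired data and then reading off a posterior over articulator state, can be illustrated with a deliberately small sketch. It assumes a single scalar articulator, a single scalar acoustic feature, and one joint Gaussian; the abstract does not say which density model the paper uses. The function `toy_synthesizer` is a hypothetical stand-in for an articulatory synthesizer, and the sketch does not cover maximum-likelihood trajectories.

```python
import random


def fit_joint_gaussian(pairs):
    """Estimate means, variances and covariance of (articulator, acoustic) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    vx = sum((x - mx) ** 2 for x, _ in pairs) / (n - 1)
    vy = sum((y - my) ** 2 for _, y in pairs) / (n - 1)
    cxy = sum((x - mx) * (y - my) for x, y in pairs) / (n - 1)
    return mx, my, vx, vy, cxy


def articulator_posterior(model, y):
    """Mean and variance of p(articulator | acoustic = y) under the joint Gaussian."""
    mx, my, vx, vy, cxy = model
    gain = cxy / vy  # regression coefficient of articulator on acoustic
    return mx + gain * (y - my), vx - gain * cxy


def toy_synthesizer(x, rng):
    # Hypothetical articulatory-to-acoustic map: linear plus small Gaussian noise.
    return 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)


rng = random.Random(0)
# The articulator positions are drawn from a prior; the paired corpus comes from the toy map.
positions = [rng.gauss(0.0, 1.0) for _ in range(2000)]
pairs = [(x, toy_synthesizer(x, rng)) for x in positions]

model = fit_joint_gaussian(pairs)
post_mean, post_var = articulator_posterior(model, y=3.0)
```

Because the joint model includes the distribution of articulator positions, the posterior mean is pulled slightly toward the prior mean rather than being a pure inversion of the toy map, and the posterior variance is smaller than the prior variance `model[2]`.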

