Results 1 - 3 of 3
Contents lists available at ScienceDirect Journal of Phonetics
"... journal homepage: www.elsevier.com/phonetics ..."
Session 2: Tuning Communication in Face-to-Face Interaction Speech Structure Acquisition for Interactive Systems
Abstract
Robots in social interaction need to be able to communicate with humans in a human-like manner by understanding and using their language. One important first element of speech and language understanding is the ability to parse spoken utterances into words. But this ability is not innate and needs to be developed by infants within the first years of their life. So far almost all computational
Production knowledge in the recognition of dysarthric speech
, 2011
Abstract
Millions of individuals have acquired or have been born with neuro-motor conditions that limit the control of their muscles, including those that manipulate the articulators of the vocal tract. These conditions, collectively called dysarthria, result in speech that is very difficult to understand, despite being generally syntactically and semantically correct. This difficulty is not limited to human listeners, but also adversely affects the performance of traditional automatic speech recognition (ASR) systems, which in some cases can be completely unusable by the affected individual. This dissertation describes research into improving ASR for speakers with dysarthria by incorporating knowledge of their speech production. The document first introduces theoretical aspects of dysarthria and of speech production and outlines related work in these combined areas within ASR. It then describes the acquisition and analysis of the TORGO database of dysarthric articulatory motion and demonstrates several consistent behaviours among speakers in this database, such as predictable pronunciation errors. Articulatory data are then used to train augmented ASR systems that model the statistical relationships between vocal tract configurations and their acoustic consequences. I show that dynamic

