Dynamical Spectrograms That Can Be Perceived as Visual Gestures, 1998
Cited by 1 (1 self)
A new system for speech visualisation has been implemented to allow deaf and hearing-impaired people to understand verbal information over channels such as the ordinary public telephone system. Incorporating a computational model of the human ear, the system converts incoming sounds into a sequence of animated images, which show the temporal variations of the spectral pattern of the input sound in real time and which are perceived like visual gestures. Preliminary results from forced-choice tests with 28 human subjects are reported, using a sequence of 2- to 4-word sets. To demonstrate the language independence of this approach, some of these were taken from 4 very different languages: English, Persian, French and Czech. The results show high levels of recognition after only 10 learning trials (typical mean scores of 50-85%, where zero represents chance expectation), and encourage further investigation. 1. INTRODUCTION Since its invention, the spectrogram has been the most widely u...
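The abstract reports scores on a scale "where zero represents chance expectation" but does not say how they were rescaled. A minimal sketch of one common correction for forced-choice tests follows, assuming every alternative is equally likely under guessing. The function name `chance_corrected` is hypothetical, and this may not be the formula the authors used.

```python
def chance_corrected(proportion_correct: float, n_choices: int) -> float:
    """Rescale a forced-choice score so guessing maps to 0 and a perfect score to 1.

    Assumption: all n_choices alternatives are equally likely under guessing.
    This is a standard correction, not necessarily the one used in the paper.
    """
    if n_choices < 2:
        raise ValueError("a forced-choice test needs at least two alternatives")
    chance = 1.0 / n_choices
    return (proportion_correct - chance) / (1.0 - chance)


if __name__ == "__main__":
    # The same raw score of 80% correct, for word sets of 2, 3 and 4 words.
    for n in (2, 3, 4):
        print(n, round(chance_corrected(0.8, n), 3))
```

Under this correction, the same raw score gives a higher corrected score for larger word sets, because guessing succeeds less often when there are more alternatives.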
Large vocabulary continuous speech recognition using linguistic features and constraints, 2005
Cited by 1 (0 self)
Automatic speech recognition (ASR) is a process of applying constraints, as encoded in the computer system (the recognizer), to the speech signal until ambiguity is satisfactorily resolved to the extent that only one sequence of words is hypothesized. Such constraints fall naturally into two categories. One deals with the ordering of words (syntax) and the organization of their meanings (semantics, pragmatics, etc.). The other governs how speech signals are related to words, a process often termed “lexical access”. This thesis studies the Huttenlocher-Zue lexical access model, its implementation in a modern probabilistic speech recognition framework and its application to continuous speech from an open vocabulary. The Huttenlocher-Zue model advocates a two-pass lexical access paradigm. In the first pass, the lexicon is effectively pruned using broad linguistic constraints. In the original Huttenlocher-Zue model, the authors proposed six linguistic features motivated by the manner of pronunciation.
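The two-pass idea can be illustrated with a toy lookup: a first pass maps each sound to a broad manner class and keeps only the words whose class sequence matches, and a second pass compares the surviving words in full detail. This is a deterministic sketch under stated assumptions, not the thesis's probabilistic implementation. The class inventory in `BROAD_CLASS`, the one-letter phone symbols, the toy `LEXICON`, and all function names are hypothetical.

```python
from collections import defaultdict

# Illustrative broad manner classes. This inventory is an assumption made
# for the example, not the feature set defined in the thesis.
BROAD_CLASS = {
    "p": "STOP", "b": "STOP", "t": "STOP", "d": "STOP", "k": "STOP", "g": "STOP",
    "m": "NASAL", "n": "NASAL",
    "s": "STRONG_FRIC", "z": "STRONG_FRIC",
    "f": "WEAK_FRIC", "v": "WEAK_FRIC",
    "l": "LIQUID_GLIDE", "r": "LIQUID_GLIDE", "w": "LIQUID_GLIDE",
    "a": "VOWEL", "e": "VOWEL", "i": "VOWEL", "o": "VOWEL", "u": "VOWEL",
}

# Toy lexicon: word -> sequence of (hypothetical) phone symbols.
LEXICON = {
    "bat": ("b", "a", "t"),
    "pat": ("p", "a", "t"),
    "dog": ("d", "o", "g"),
    "mat": ("m", "a", "t"),
    "sat": ("s", "a", "t"),
}


def broad_signature(phones):
    """Map a phone sequence to its sequence of broad manner classes."""
    return tuple(BROAD_CLASS[p] for p in phones)


def build_index(lexicon):
    """Group words by broad signature so the first pass is a single lookup."""
    index = defaultdict(list)
    for word, phones in lexicon.items():
        index[broad_signature(phones)].append(word)
    return index


def two_pass_lookup(observed_phones, lexicon, index):
    """Return (cohort, matches): pass 1 prunes, pass 2 matches in detail."""
    # Pass 1: keep only words sharing the observed broad-class sequence.
    cohort = index.get(broad_signature(observed_phones), [])
    # Pass 2: detailed comparison, restricted to the pruned cohort.
    matches = [w for w in cohort if lexicon[w] == tuple(observed_phones)]
    return cohort, matches


INDEX = build_index(LEXICON)

if __name__ == "__main__":
    cohort, matches = two_pass_lookup(["b", "a", "t"], LEXICON, INDEX)
    print("cohort after pass 1:", sorted(cohort))
    print("matches after pass 2:", matches)
```

The point of the design is that the cheap first pass shrinks the search space before any detailed comparison is attempted. In a real recognizer both passes would score acoustic evidence rather than compare symbols exactly.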
Dynamical Spectrogram, An Aid For The Deaf
Visual perception of speech through spectrogram reading has long been a subject of research, as an aid for the deaf or hearing impaired. Attributing the lack of success of this type of visual aid mainly to the static form of information presented by spectrograms, this paper proposes a system of dynamic visualisation for speech sounds. The system samples a high-resolution, auditory-based spectrogram with a window of 20 milliseconds duration and, by exploiting the periodicity of the input sound, produces a phase-locked sequence of images. This sequence is then animated at a rate of 50 images per second to produce a movie-like image displaying both the time-varying and time-independent information of the underlying sound. Results of several preliminary experiments for evaluation of the potential usefulness of the system for the deaf, undertaken by normal-hearing subjects, support the quick learning and persistence of the gestures for small sets of single words and motivate further...
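The windowing arithmetic in this abstract (20 ms windows played back at 50 images per second) can be sketched as follows. The sketch substitutes a plain short-time Fourier magnitude spectrogram for the paper's auditory model and does not attempt the phase-locking step. The function names, the 1 ms hop and the 8 ms analysis length are all assumptions chosen for illustration.

```python
import numpy as np


def stft_magnitude(signal, sample_rate, hop_s=0.001, fft_s=0.008):
    """Fine-grained magnitude spectrogram, shape (freq_bins, time_columns).

    Stand-in for the auditory-based spectrogram; hop_s and fft_s are assumed.
    """
    hop = int(round(hop_s * sample_rate))
    n_fft = int(round(fft_s * sample_rate))
    n_cols = max(0, 1 + (len(signal) - n_fft) // hop)
    window = np.hanning(n_fft)
    cols = [
        np.abs(np.fft.rfft(window * signal[i * hop : i * hop + n_fft]))
        for i in range(n_cols)
    ]
    return np.array(cols).T


def slice_into_images(spectrogram, hop_s=0.001, window_s=0.020):
    """Cut the spectrogram into consecutive window_s-long images.

    Returns an array of shape (n_images, freq_bins, cols_per_image) and the
    playback rate in images per second (1 / window_s, i.e. 50 for 20 ms).
    """
    cols_per_image = int(round(window_s / hop_s))
    n_images = spectrogram.shape[1] // cols_per_image
    trimmed = spectrogram[:, : n_images * cols_per_image]
    images = trimmed.reshape(
        spectrogram.shape[0], n_images, cols_per_image
    ).transpose(1, 0, 2)
    return images, 1.0 / window_s


if __name__ == "__main__":
    sample_rate = 8000
    t = np.arange(sample_rate) / sample_rate      # one second of signal
    tone = np.sin(2 * np.pi * 200.0 * t)          # 200 Hz test tone
    spec = stft_magnitude(tone, sample_rate)
    images, fps = slice_into_images(spec)
    print(spec.shape, images.shape, fps)
```

Because the windows are non-overlapping and 20 ms long, showing one image per window gives exactly the 50 images per second stated in the abstract.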

