Extraction of Visual Features for Lipreading (2002)
| Venue: | IEEE Transactions on Pattern Analysis and Machine Intelligence |
| Citations: | 36 - 1 self |
BibTeX
@ARTICLE{Matthews02extractionof,
author = {Iain Matthews and Tim Cootes and J. Andrew Bangham and Stephen Cox and Richard Harvey},
title = {Extraction of Visual Features for Lipreading},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2002},
volume = {24},
pages = {2002}
}
Abstract
The multi-modal nature of speech is often ignored in human-computer interaction, but lip deformation and other body movements, such as head and arm motion, all convey additional information. We integrate speech cues from many sources, and this improves intelligibility, especially when the acoustic signal is degraded. This paper shows how this additional, often complementary, visual speech information can be used for speech recognition. Three methods for parameterising lip image sequences for recognition using hidden Markov models are compared. Two of these are top-down approaches that fit a model of the inner and outer lip contours and derive lipreading features from a principal component analysis of shape, or of shape and appearance, respectively. The third, bottom-up, method uses a non-linear scale-space analysis to form features directly from the pixel intensity. All methods are compared on a multi-talker visual speech recognition task of isolated letters.







