The Visual Analysis of Human Movement: A Survey
Computer Vision and Image Understanding, 1999
Cited by 456 (7 self)
The ability to recognize humans and their activities by vision is key for a machine to interact intelligently and effortlessly with a human-inhabited environment. Because of many potentially important applications, “looking at people” is currently one of the most active application domains in computer vision. This survey identifies a number of promising applications and provides an overview of recent developments in this domain. The scope of this survey is limited to work on whole-body or hand motion; it does not include work on human faces. The emphasis is on discussing the various methodologies; they are grouped into 2-D approaches with or without explicit shape models and 3-D approaches. Where appropriate, systems are reviewed. We conclude with some thoughts about future directions. © 1999 Academic Press
Real-time American Sign Language recognition using desk and wearable computer based video
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998
Cited by 367 (20 self)
We present two real-time hidden Markov model-based systems for recognizing sentence-level continuous American Sign Language (ASL) using a single camera to track the user’s unadorned hands. The first system observes the user from a desk mounted camera and achieves 92 percent word accuracy. The second system mounts the camera in a cap worn by the user and achieves 98 percent accuracy (97 percent with an unrestricted grammar). Both experiments use a 40-word lexicon.
Visual Recognition of American Sign Language Using Hidden Markov Models
1995
Cited by 240 (14 self)
Using hidden Markov models (HMMs), an unobtrusive single view camera system is developed that can recognize hand gestures, namely, a subset of American Sign Language (ASL). Previous systems have concentrated on finger spelling or isolated word recognition, often using tethered electronic gloves for input. We achieve high recognition rates for full sentence ASL using only visual cues. A forty word lexicon consisting of personal pronouns, verbs, nouns, and adjectives is used to create 494 randomly constructed five word sentences that are signed by the subject to the computer. The data is separated into a 395 sentence training set and an independent 99 sentence test set. While signing, the 2D position, orientation, and eccentricity of bounding ellipses of the hands are tracked in real time with the assistance of solidly colored gloves. Simultaneous recognition and segmentation of the resultant stream of feature vectors occurs five times faster than real time on an HP 735. With a strong ...
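The hand features named in this abstract (2D position, orientation, and eccentricity of a bounding ellipse) can be derived from the second-order image moments of a segmented hand region. The following is a minimal Python sketch under that assumption. The function name `ellipse_features`, the binary-mask input, and the moment-based fit are illustrative choices, not the authors' implementation, which the abstract does not describe.

```python
import numpy as np

def ellipse_features(mask):
    """Return (cx, cy, theta, eccentricity) for the foreground of a binary mask.

    Hypothetical helper: fits an ellipse through second-order moments of the
    foreground pixel coordinates. Angles use image coordinates (y points down).
    """
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        raise ValueError("mask has no foreground pixels")

    # Centroid gives the 2D position of the blob.
    cx, cy = xs.mean(), ys.mean()

    # Central second-order moments (covariance of the pixel coordinates).
    dx, dy = xs - cx, ys - cy
    mxx, myy, mxy = (dx * dx).mean(), (dy * dy).mean(), (dx * dy).mean()

    # Orientation of the major axis in radians, measured from the x axis.
    theta = 0.5 * np.arctan2(2.0 * mxy, mxx - myy)

    # Eigenvalues of the covariance matrix: variances along the two axes.
    spread = np.sqrt(((mxx - myy) / 2.0) ** 2 + mxy ** 2)
    lam_major = (mxx + myy) / 2.0 + spread
    lam_minor = (mxx + myy) / 2.0 - spread

    if lam_major <= 0.0:
        # Single-pixel blob: no meaningful elongation.
        return cx, cy, theta, 0.0

    # Clamp to guard against tiny negative values from rounding.
    eccentricity = np.sqrt(max(0.0, 1.0 - lam_minor / lam_major))
    return cx, cy, theta, eccentricity
```

Calling the function once per hand mask per frame would yield one feature vector per frame, which is the kind of stream an HMM recognizer consumes.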
A Wearable Computer Based American Sign Language Recognizer
1997
Cited by 38 (0 self)
Modern wearable computer designs package workstation level performance in systems small enough to be worn as clothing. These machines enable technology to be brought where it is needed the most for the handicapped: everyday mobile environments. This paper describes a research effort to make a wearable computer that can recognize (with the possible goal of translating) sentence level American Sign Language (ASL) using only a baseball cap mounted camera for input. Current accuracy exceeds 97% per word on a 40 word lexicon.
GRASP: Recognition of Australian Sign Language Using Instrumented Gloves
1995
Cited by 21 (3 self)
Instrumented gloves -- gloves equipped with sensors for detecting finger bend, hand position and orientation -- were conceived to allow a more natural interface to computers. However, the extension of their use for recognising sign language, and in this case Auslan (Australian Sign Language), is possible. Several researchers have already explored these possibilities and have successfully achieved finger-spelling recognition with high levels of accuracy, but progress in the recognition of sign language as a whole has been limited.
Machine Recognition of Auslan Signs Using PowerGloves: Towards Large-Lexicon Recognition of Sign Language
Proceedings of the Workshop on the Integration of Gesture in Language and Speech, 1996
Cited by 12 (0 self)
Instrumented gloves use a variety of sensors to provide information about the user's hand. They can be used for recognition of gestures, especially well-defined gesture sets such as sign languages. However, recognising gestures is a difficult task, due to intrapersonal and interpersonal variations in performing them. One approach to solving this problem is to use machine learning. In this case, samples of 95 discrete Australian Sign Language (Auslan) signs were collected using a PowerGlove. Two machine learning techniques were applied -- instance-based learning (IBL) and decision-tree learning -- to the data after some simple features were extracted. Accuracy of approximately 80 per cent was achieved using IBL, despite the severe limitations of the glove.
Introduction. Sign language recognition is interesting for a number of reasons; it represents an interesting domain in itself with obvious real-world applications, but it also makes a good starting point for gesture recognition in genera...
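Instance-based learning, as mentioned in this abstract, classifies a new sample by comparing it with stored training samples. The simplest form is a 1-nearest-neighbour classifier, sketched below in Python. The abstract does not specify the IBL variant, the distance measure, or the extracted features, so the Euclidean distance, the function name `nearest_neighbor_predict`, and the toy feature vectors are all assumptions made for illustration.

```python
import numpy as np

def nearest_neighbor_predict(train_x, train_y, query):
    """Return the label of the stored sample closest to `query`.

    Hypothetical helper: train_x is a sequence of feature vectors, train_y the
    matching sequence of labels, query a single feature vector.
    """
    train_x = np.asarray(train_x, dtype=float)
    query = np.asarray(query, dtype=float)
    # Euclidean distance from the query to every stored instance.
    dists = np.linalg.norm(train_x - query, axis=1)
    return train_y[int(np.argmin(dists))]

# Toy usage with made-up two-dimensional features and sign labels.
samples = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
labels = ["sign_a", "sign_a", "sign_b"]
predicted = nearest_neighbor_predict(samples, labels, [4.0, 4.5])
```

A classifier of this kind needs no training phase beyond storing the samples, which makes it easy to extend a lexicon by adding more examples.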
Visual interpretation for hand gestures as a practical interface modality
Columbia University, 1997
Cited by 9 (0 self)
This dissertation describes a user interface in which many tasks traditionally performed by a mouse are instead performed using visual recognition of hand gestures. The goals are to explore both how a vision system should be designed to recognize hand gestures, and how they are best used in a general purpose interface. Observed by a camera below the screen, the user manipulates objects directly with gestures incorporating both motion and pose. Task and domain knowledge provide context, allowing real-time recognition on standard PC hardware. A color-based algorithm is trained to segment the user's hands from complex backgrounds without visual aids. Training uses a novel combination of both positive and negative data to improve segmentation quality. The apparent path of the hand is smoothed with an algorithm which reduces the types of noise inherent in the domain but leaves a cursor motion on the screen that feels natural for the user. Salient features of the motion are extracted, including a newly discovered natural gesture (a “Comma”), which helps provide punctuation for each gestural sentence. Neural networks are trained to classify the pose of the user's hand from cropped and preprocessed images. The nets correctly classify 90-95% of the hand images in real time. A transition network encodes the interaction language. It controls the application of feature extraction operators and interprets their results to determine when to perform actions on the user's behalf. The style of interaction is based on studies of natural gesticulation and incorporates various features designed to make it natural and easy for the user to remember. The system demonstrates an 80-90% success rate on most tasks. Object selection time for large objects is demonstrated to be equal or superior to that of a mouse. Object selection performance is modeled accurately by augmenting Fitts' Law with terms for lag and random cursor noise.
Finally, the suitability of gesture for this type of task is considered. Various interaction styles are examined, and problems specific to hand gesture are discussed.
Vision-based 3-D tracking of humans in action
1996
Cited by 6 (1 self)
The ability to recognize humans and their activities by vision is essential for future machines to interact intelligently and effortlessly with a human-inhabited environment. Some of the more promising applications are discussed. A prototype vision system is presented for the tracking of whole-body movement using multiple cameras. 3-D body pose is recovered at each time instant based on occluding contours. The pose-recovery problem is formulated as a search problem and entails finding the pose parameters of a graphical human model whose synthesized appearance is most similar to the actual appearance of the real human in the multi-view images. Hermite deformable contours are proposed as a tool for the 2-D contour tracking problem. The main contribution of this dissertation is that it demonstrates for the first time a set of techniques that allow accurate vision-based 3-D tracking of arbitrary whole-body movement without the use of markers.
Tracking and Analysis of Articulated Motion with an Application to Human Motion
2000
Cited by 4 (3 self)
Articulated motion is a subset of non-rigid motion in which the object of interest is composed of several rigid components connected to each other by ball and hinge joints. The human body, many animals and insects, and machinery all exhibit such motion. This dissertation addresses the problem of vision-based tracking and analysis of this type of motion. The importance of this problem can be seen in many application domains including surveillance, traffic monitoring, entertainment, user interfaces, medicine, sports, video annotation, and image compression. This dissertation deals with two important subproblems of the general problem: whole-body tracking and motion recognition. In whole-body tracking, the body is tracked as one unit without paying attention to the details of the posture and limbs. Current solutions to this problem suffer from being too sensitive to small changes in the environment. We present a novel approach which reduces these restrictions significantly. This is achieved by separating the concepts of a blob from that of a body and by tracking each independently while maintaining a many-to-many relationship between the two. The approach makes use of the Extended Kalman Filter and outputs trajectory information in world coordinates. The method was tested by tracking pedestrians in a variety of environments and achieved real-time performance and a high degree of robustness. Motion recognition is the high level problem of classifying an action taking place in a video sequence into one of several action categories. Most of the present approaches attempt to perform three-dimensional reconstruction of the articulated shape prior to recognition, which is an inherently difficult problem made even more difficult due to the nonrigidity of the articulated object. W...
Visual Recognition of Hand Motion
1997
Cited by 4 (0 self)
Hand gesture recognition has been an active area of research in recent years, being used in various applications from deaf sign recognition systems to human-machine interaction applications. The gesture recognition process, in general, may be divided into two stages: the motion sensing, which extracts useful data from hand motion; and the classification process, which classifies the motion sensing data as gestures. The existing vision-based gesture recognition systems extract 2-D shape and trajectory descriptors from the visual input, and classify them using various classification techniques from maximum likelihood estimation to neural networks, finite state machines, Fuzzy Associative Memory (FAM) or Hidden Markov Models (HMMs). This thesis presents the framework of the vision-based Hand Motion Understanding (HMU) system that recognises static and dynamic Australian Sign Language (Auslan) signs by extracting and classifying 3-D hand configuration data from the visual input. The HMU system is...

