Results 1 - 10 of 17
Gesture Recognition
Cited by 2223 (28 self)
Introduction A primary goal of virtual environments is to support natural, efficient, powerful, and flexible interaction. If the interaction technology is overly obtrusive, awkward, or constraining, the user's experience with the synthetic environment is severely degraded. If the interaction itself draws attention to the technology, rather than the task at hand, or imposes a high cognitive load on the user, it becomes a burden and an obstacle to a successful virtual environment experience. The traditional two-dimensional, keyboard- and mouse-oriented graphical user interface (GUI) is not well-suited for virtual environments. Instead, synthetic environments provide the opportunity to utilize several different sensing modalities and technologies and integrate them into the user experience. Devices which sense body position and orientation, direction of gaze, speech and sound, facial expression, galvanic skin response, and other aspects of human behavior or state can be used to mediate c
Hand Gesture Recognition using Multi-Scale Colour Features, Hierarchical Models and Particle Filtering
- Proc. Face and Gesture, 2002
Cited by 10 (4 self)
This paper presents algorithms and a prototype system for hand tracking and hand posture recognition. Hand postures are represented in terms of hierarchies of multi-scale colour image features at different scales, with qualitative inter-relations in terms of scale, position and orientation. In each image, detection of multi-scale colour features is performed. Hand states are then simultaneously detected and tracked using particle filtering, with an extension of layered sampling referred to as hierarchical layered sampling. Experiments are presented showing that the performance of the system is substantially improved by performing feature detection in colour space and including a prior with respect to skin colour. These components have been integrated into a real-time prototype system, applied to a test problem of controlling consumer electronics using hand gestures. In a simplified demo scenario, this system has been successfully tested by participants at two fairs during 2001.
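As a rough illustration of the particle filtering step mentioned in this abstract, the sketch below runs a plain bootstrap particle filter on a one-dimensional position. It does not implement the paper's hierarchical layered sampling or its colour features; the names `particle_filter_step` and `track`, the random-walk motion model, the Gaussian likelihood, and the noise parameters are all assumptions made for the example.

```python
import math
import random


def particle_filter_step(particles, observation, rng,
                         motion_noise=0.5, obs_noise=1.0):
    """One predict/weight/resample cycle of a bootstrap particle filter."""
    # Predict: diffuse each particle with a random-walk motion model (assumed).
    predicted = [p + rng.gauss(0.0, motion_noise) for p in particles]
    # Weight: Gaussian likelihood of the observation given each particle (assumed).
    weights = [math.exp(-0.5 * ((observation - p) / obs_noise) ** 2)
               for p in predicted]
    if sum(weights) == 0.0:
        # Every likelihood underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw particles with probability proportional to weight.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate(particles):
    """Mean of the particle set, used as the state estimate."""
    return sum(particles) / len(particles)


def track(observations, n_particles=500, seed=0):
    """Run the filter over a sequence of scalar observations."""
    rng = random.Random(seed)
    particles = [rng.uniform(-10.0, 10.0) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        particles = particle_filter_step(particles, z, rng)
        estimates.append(estimate(particles))
    return estimates
```

Calling `track([3.0] * 20)` should return estimates that settle close to 3.0. In a hand tracker the scalar state would be replaced by a hand-state vector and the likelihood by an image-based measure.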
Object Recognition with Multiple Feature Types
- In ICANN'98, Proceedings of the 8th International Conference on Artificial Neural Networks, 1998
Cited by 9 (3 self)
One of the brain's recipes for robustly perceiving the world is to integrate multiple feature types such as shape, color, texture and motion. We have investigated how far also neural-network based object recognition can profit from the combination of several feature types. For this purpose we have extended Elastic Graph Matching such that several feature types may be combined in the object models. We applied the system in two difficult application domains, the interpretation of cluttered scenes and the recognition of hand postures against complex backgrounds. Our results demonstrate that the usage of additional feature types significantly improves performance. 1 Introduction Vision is a hard problem which our brains solve very well. The neurons in visual cortex extract different features of the input image. Some represent shape, others motion or color or combinations of these. These features have to be integrated to form object descriptions which can be stored and recognized. In compu...
ORASSYLL: Object Recognition with Autonomously Learned and Sparse Symbolic Representations Based on Metrically Organized Local Line Detectors (Object Recognition with ORASSYLL)
- 2000
Cited by 6 (3 self)
We introduce an object recognition and localization system in which objects are represented as a sparse and spatially organized set of local (bent) line segments.
Correspondence Using Distinct Points Based on Image Invariants
- 1997
Cited by 4 (1 self)
We present a method, based on the idea of distinctive points, for locating point correspondences between two images of similar objects independently of scale, orientation and position. Distinctive points are those which have a low probability of being mistaken with other points, and therefore are more likely to be correctly located in a similar image. The local image structure at each image point is described by vectors of Cartesian differential invariants computed at a range of scales. Distinctive points lie in low density regions of the distribution of all vectors of invariants found in an image. The vectors of invariants of distinct points are used to locate similar points in a second image. Results of applying this technique to find correspondences between images of faces are shown. 1 Introduction A common problem in computer vision is that of establishing correspondences between images of similar objects. For images of objects with fixed 3D geometry 5 correspondences a...
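The abstract above describes distinctive points as those lying in low-density regions of the distribution of invariant vectors. One simple way to sketch that idea is to use the distance to the k-th nearest neighbour as a proxy for low density. This is an assumption for illustration only: the paper may use a different density estimator, and the toy vectors below are made-up stand-ins for the Cartesian differential invariants.

```python
import math


def knn_distance(vectors, index, k):
    """Distance from vectors[index] to its k-th nearest neighbour."""
    target = vectors[index]
    dists = sorted(math.dist(target, v)
                   for j, v in enumerate(vectors) if j != index)
    return dists[k - 1]


def most_distinctive(vectors, n_points, k=2):
    """Indices of the n_points vectors lying in the sparsest regions.

    A large k-th neighbour distance is treated as a sign of low density.
    """
    scores = [(knn_distance(vectors, i, k), i) for i in range(len(vectors))]
    scores.sort(reverse=True)
    return [i for _, i in scores[:n_points]]


# Toy feature vectors (hypothetical): a tight cluster plus two isolated points.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
            (5.0, 5.0), (-4.0, 6.0)]
distinctive = most_distinctive(features, n_points=2)
```

Here the two isolated vectors (indices 4 and 5) are selected, while the four clustered vectors, which would be easy to confuse with one another, are not.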
The Living Machine Initiative
- Department of Computer Science, Michigan State University, East Lansing, 1996
Cited by 4 (3 self)
While digital multimedia are entering all walks of life, breakthroughs in machine understanding of multimodal information such as video, images, speech, language, and various forms of hand-written or mix-printed text, can lead to numerous applications that will significantly expand the application base of computer technology, and improve human life, scientific and engineering research, education, and human resource base. However, machine understanding of multimodal information in its general form proves to be an extremely challenging task facing the research community today, despite the fast and sustained advance of computers in their speed, storage capacity, performance-to-price ratio, and installation base. The principal investigator (PI) has been investigating persisting difficulties encountered by the existing basic methodology: manually-modeling-knowledge and spoonfeeding-knowledge (MMKSK). Researchers in each subfield have been manually developing knowledge-level theories and methods, and u...
The catchment feature model: A device for multimodal fusion and a bridge between signal and sense
- In Review: EURASIP JASP, 2002
Cited by 4 (3 self)
goes to our extended research team, especially David McNeill, a friend and colleague, upon whose psycholinguistic research this work is based. The Catchment Feature Model addresses two questions in the field of multimodal interaction: how do we bridge video and audio processing with the realities of human multimodal communication, and how information from the different modes may be fused. We argue from a detailed literature review that gestural research has clustered around manipulative and semaphoric use of the hands, motivate the Catchment Feature Model from psycholinguistic research, and present the Model. In contrast to 'whole gesture' recognition, the Catchment Feature Model applies a feature decomposition approach that facilitates cross-modal fusion at the level of discourse planning and conceptualization. We present our experimental framework for catchment feature-based research, cite three concrete examples of Catchment Features, and propose new directions of multimodal research based on the model.
Robust face detection and Japanese Sign Language hand posture recognition for human-computer interaction
- the Fifteenth International Conference on Vision Interface, 2002
Cited by 2 (0 self)
A system for the detection of human faces and for the classification of hand postures of the Japanese Sign Language in color images inside an "intelligent" room is presented. We first propose to apply a combination of a skin chrominance-based image segmentation with a color vector gradient-based edge detection [1] [2] to efficiently detect faces and hands. Within the framework of a general approach, a statistical model for face detection based on invariant moments [3] [4] is used to discriminate between faces and hands in the segmented images. A novel approach to hand posture recognition based on phase-only correlation [5] is then adopted to classify a subset of static hand postures of the Japanese Sign Language, each posture representing a given phoneme, and also to discriminate between hand postures and the image scene background. Experiments show that the additional use of the color vector gradient significantly improves the correct rate of face detection, and that the phase-only correlation filter yields a high rate of discrimination between different static hand postures as well as between hand postures and the scene background. Ultimately, the system is to contribute to the implementation of meaningful human-machine interactions in a room that we are in the process of establishing, the "percept-room", mainly for welfare applications.
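To make the phase-only correlation mentioned in this abstract more concrete, the following sketch computes it for short one-dimensional signals with a naive DFT. It is only an illustration of the general technique: the paper works on images and its exact filter design is not given in the abstract, and the function names `dft`, `phase_only_correlation`, and `best_shift` are hypothetical.

```python
import cmath


def dft(signal, inverse=False):
    """Naive O(n^2) discrete Fourier transform (fine for short signals)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(x * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        out.append(total / n if inverse else total)
    return out


def phase_only_correlation(f, g, eps=1e-12):
    """Correlation surface built from the phase of the cross spectrum."""
    F, G = dft(f), dft(g)
    cross = [a * b.conjugate() for a, b in zip(F, G)]
    # Discard the magnitude so that only phase differences remain.
    normalised = [c / max(abs(c), eps) for c in cross]
    return [v.real for v in dft(normalised, inverse=True)]


def best_shift(f, g):
    """Circular shift s for which f[t] best matches g[t - s]."""
    poc = phase_only_correlation(f, g)
    return max(range(len(poc)), key=lambda i: poc[i])
```

For a signal and a circularly shifted copy of it, the correlation surface has a single sharp peak near 1.0 at the shift. The height of that peak can serve as a similarity score when comparing an input against stored templates, which is one plausible way such a filter could be used for classification.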
Vision And Learning For Intelligent Human-Computer Interaction
- 2001
Cited by 2 (0 self)
It was a dream to make computers see. The research in computer vision provides promising technologies to capture, analyze, transmit, retrieve and interpret visual information. However, due to the richness and large variations in the visual inputs, the practice of many statistical learning techniques for visual motion capturing and recognition is confronted by similar problems, so that making intelligent and visually capable machines is still a challenging task. This dissertation concentrates on two important problems: capturing and recognizing human motion in video sequences, which are crucial for the research and applications of intelligent human computer interaction, multimedia communication, and smart environments.

