Extraction of 2D motion trajectories and its application to hand gesture recognition
PAMI, 2002
Cited by 26 (1 self)
Abstract: We present an algorithm for extracting and classifying two-dimensional motion in an image sequence based on motion trajectories. First, a multiscale segmentation is performed to generate homogeneous regions in each frame. Regions between consecutive frames are then matched to obtain two-view correspondences. Affine transformations are computed from each pair of corresponding regions to define pixel matches. Pixel matches over consecutive image pairs are concatenated to obtain pixel-level motion trajectories across the image sequence. Motion patterns are learned from the extracted trajectories using a time-delay neural network. We apply the proposed method to recognize 40 hand gestures of American Sign Language. Experimental results show that motion patterns of hand gestures can be extracted and recognized accurately using motion trajectories. Index Terms: Motion segmentation, motion analysis, motion trajectory, American Sign Language, hand gesture recognition, time-delay neural network.
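The trajectory step described in the abstract above (affine transformations defining pixel matches, which are then concatenated over consecutive image pairs) can be illustrated with a minimal Python sketch. It assumes the affine map covering the tracked pixel is already known for each frame pair; the function names, the tuple layout, and the numeric values are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: chain per-frame-pair affine maps into one pixel trajectory.
# Each affine map is given as two rows: ((a, b, tx), (c, d, ty)).

def apply_affine(affine, point):
    """Map a point (x, y) to (a*x + b*y + tx, c*x + d*y + ty)."""
    (a, b, tx), (c, d, ty) = affine
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)


def concatenate_trajectory(start, affines):
    """Follow one pixel through consecutive frame pairs.

    `affines[i]` is assumed to be the map of the region containing the
    pixel between frame i and frame i + 1.
    """
    trajectory = [start]
    for affine in affines:
        trajectory.append(apply_affine(affine, trajectory[-1]))
    return trajectory


# Illustrative values only: a translation by (1, 2), then a uniform scaling by 2.
translate = ((1.0, 0.0, 1.0), (0.0, 1.0, 2.0))
scale = ((2.0, 0.0, 0.0), (0.0, 2.0, 0.0))
trajectory = concatenate_trajectory((3.0, 4.0), [translate, scale])
print(trajectory)
```

With these illustrative maps, the pixel at (3, 4) moves to (4, 6) and then to (8, 12).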
Design of Virtual Three-dimensional Instruments for Sound Control
1998
Cited by 14 (0 self)
An environment for designing virtual instruments with 3D geometry has been prototyped and applied to real-time sound control and design. It enables a sound artist, musical performer or composer to design an instrument according to preferred or required gestural and musical constraints instead of constraints based only on physical laws as they apply to an instrument with a particular geometry. Sounds can be created, edited or performed in real-time by changing parameters like position, orientation and shape of a virtual 3D input device. The virtual instrument can only be perceived through a visualization and acoustic representation, or sonification, of the control surface. No haptic representation is available. This environment was implemented using CyberGloves, Polhemus sensors, an SGI Onyx and by extending a real-time, visual programming language called Max/FTS, which was originally designed for sound synthesis. The extension involves software objects that interface the sensors and so...
Intimacy and Embodiment: Implications for Art and Technology
2000
Cited by 13 (1 self)
People have aesthetic experiences when they manipulate objects skillfully. Highly skilled performance with an object requires forming a highly intimate relationship with it. Aesthetics flow from this intimacy. This paper discusses three works which bring together technology and art to illustrate the issues of intimacy and embodiment. The three works are: Iamascope, video cubism and the forklift ballet.
Mapping Virtual Object Manipulation to Sound Variation
IPSJ SIG notes, 1997
Cited by 10 (1 self)
Abstract: We are studying the use of dexterous manipulation in a virtual environment to create, edit and perform sounds. In our Max/FTS-based experimental environment, a virtual object functions as an input device for the editing of sound; the sound artist literally “sculpts” sounds by changing virtual object attributes (shape, position and orientation). We identified three sculpting methods and discuss one for use in physical and abstract mapping strategies. In pilot mapping experiments, the main problems we encountered were the search for interaction metaphors that can quickly be understood, incorporation of manipulation pragmatics in mappings, “touching” the virtual object and real-time computability.
Multimodal Model Integration for Sentence Unit Detection
2004
Cited by 10 (3 self)
In this paper, we adopt a direct modeling approach to utilize conversational gesture cues in detecting sentence boundaries, called SUs, in videotaped conversations. We treat the detection of SUs as a classification task such that for each inter-word boundary, the classifier decides whether there is an SU boundary or not. In addition to gesture cues, we also utilize prosody and lexical knowledge sources. In this first investigation, we find that gesture features complement the prosodic and lexical knowledge sources for this task. By using all of the knowledge sources, the model is able to achieve the lowest overall SU detection error rate.
Mapping Transparency through Metaphor: Towards More Expressive Musical Instruments
Organised Sound, 2003
Cited by 7 (0 self)
We define a two-axis transparency framework that can be used as a predictor of the expressivity of a musical device. One axis is the player's transparency scale, while the other is the audience's transparency scale. Through consideration of both traditional instrumentation and new technology-driven interfaces, we explore the role that metaphor plays in developing expressive devices. Metaphor depends on a literature, which forms the basis for making transparent device mappings. We examine four examples of systems that use metaphor: Iamascope, Sound Sculpting, MetaMuse, and Glove-TalkII; and discuss implications on transparency and expressivity. We believe this theory provides a framework for design and evaluation of new human-machine and human-human interactions, including musical instruments.
Bimanuality in Alternate Musical Instruments
2003
Cited by 6 (0 self)
This paper presents a study of bimanual control applied to sound synthesis. This study deals with coordination, cooperation, and abilities of our hands in a musical context. We describe examples of instruments made using subtractive synthesis, scanned synthesis in Max/MSP and commercial stand-alone software synthesizers via the MIDI communication protocol. These instruments have been designed according to a multi-layer-mapping model, which provides modular design. They have been used in concerts, and performance considerations are also discussed.
HandySinger: Expressive Singing Voice Morphing Using Personified Hand-puppet Interface
Proc. NIME2005, 2005
Cited by 5 (3 self)
The HandySinger system is a personified tool developed to naturally express a singing voice controlled by the gestures of a hand puppet. Assuming that a singing voice is a kind of musical expression, natural expressions of the singing voice are important for personification. We adopt a singing voice morphing algorithm that effectively smoothes out the strength of expressions delivered with a singing voice. The system's hand puppet consists of a glove with seven bend sensors and two pressure sensors. It sensitively captures the user's motion as a personified puppet's gesture. To synthesize the different expressional strengths of a singing voice, the "normal" (without expression) voice of a particular singer is used as the base of morphing, and three different expressions, "dark," "whisper" and "wet," are used as the target. This configuration provides musically expressed controls that are intuitive to users. In the experiment, we evaluate whether 1) the morphing algorithm interpolates expressional strength in a perceptual sense, 2) the hand-puppet interface provides gesture data at sufficient resolution, and 3) the gestural mapping of the current system works as planned.
GRASSP: Gesturally-Realized Audio, Speech and Song Performance
In Proceedings of NIME’06, 2006
Cited by 5 (1 self)
We describe the implementation of an environment for Gesturally-Realized Audio, Speech and Song Performance (GRASSP), which includes a glove-based interface, a mapping/training interface, and a collection of Max/MSP/Jitter bpatchers that allow the user to improvise speech, song, sound synthesis, sound processing, sound localization, and video processing. The mapping/training interface provides a framework for performers to specify by example the mapping between gesture and sound or video controls. We demonstrate the effectiveness of the GRASSP environment for gestural control of musical expression by creating a gesture-to-voice system that is currently being used by performers.
Gating Improves Neural Network Performance
In Proc. IEEE Conf. on IJCNN ’01, 2001
Cited by 4 (1 self)
In this paper, our first purpose is to study the performance of gating network functions in a committee machine setting. The problem of image deblurring is used to test the capability of such a system. Input clustering divides the task of deblurring into several subtasks. Each subtask is performed by a projection pursuit learning network (PPLN) [1]. We use a dynamic gating structure to combine outputs from various committee members. Our second purpose is to study the possibility of extending the role of the input signal beyond the decision making stage in the gating structure. Input data contain crucial structural information and characteristics of the data in a degraded form. The novel aspect of this work is the use of input signal with the output from the gating structure to produce the overall output. Resulting images show significant improvement over images that are produced from the output of the gating structure alone.
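As a rough illustration of how a gating structure can combine committee outputs, the following Python sketch weights each member's output by a softmax over gate scores. The softmax gate, the function names, and the numbers are assumptions made for illustration: the abstract does not specify the form of the dynamic gating structure, and the sketch omits the paper's additional use of the input signal alongside the gated output.

```python
import math


def softmax(scores):
    """Turn raw gate scores into non-negative weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_output(member_outputs, gate_scores):
    """Combine committee member outputs as a gate-weighted sum.

    `member_outputs` holds one scalar prediction per committee member
    (for example, one restored pixel value); `gate_scores` holds the
    gate's raw score for each member on the current input.
    """
    weights = softmax(gate_scores)
    return sum(w * y for w, y in zip(weights, member_outputs))


# Illustrative values only: equal gate scores give a plain average.
combined = gated_output([0.2, 0.8], [0.0, 0.0])
print(combined)
```

When one gate score dominates, the combined output approaches that member's output, so the gate effectively selects the member suited to the current input.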

