Multi-modal human-machine communication for instructing robot grasping tasks
In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2002
Cited by 25 (7 self)
A major challenge for the realization of intelligent robots is to supply them with cognitive abilities that allow ordinary users to program them easily and intuitively. One way of such programming is teaching work tasks by interactive demonstration. To make this effective and convenient for the user, the machine must be capable of establishing a common focus of attention and be able to use and integrate spoken instructions, visual perceptions, and non-verbal cues such as gestural commands. We report progress in building a hybrid architecture that combines statistical methods, neural networks, and finite state machines into an integrated system for instructing grasping tasks by man-machine interaction. The system combines the GRAVIS-robot for visual attention and gestural instruction with an intelligent interface for speech recognition and linguistic interpretation, and a modality fusion module to allow multi-modal task-oriented man-machine communication with respect to dextrous robot manipulation of objects.
An Integrated System for Cooperative Man-Machine Interaction
2001
Cited by 10 (5 self)
To establish robotic applications in human environments such as offices or private homes, the robotic systems must be instructable by ordinary users in a natural way. In interpersonal communication, humans usually apply different sensory information and are capable of integrating all perceptual cues quickly and consistently. Additionally, knowledge acquired during the communication process is directly used to resolve ambiguities. As a step towards realizing similar capabilities in automatic devices, this paper presents an integrated system combining automatic speech processing and image understanding. The system is intended to be an intelligent interface of a robot which manipulates objects in its surroundings according to the instructions of a human. The enhanced capabilities necessary for carrying out a multimodal man-machine dialog are realized by combining statistical and declarative methods for inference and knowledge representation. The effectiveness of this approach is demonstrated using an exemplary dialog from our construction task domain.
Learning issues in a multi-modal robot-instruction scenario
In Proc. IROS, Workshop on "Robot Programming Through Demonstration", 2003
Cited by 3 (2 self)
One of the challenges for the realization of future intelligent robots is to design architectures which make user instruction of work tasks by interactive demonstration effective and convenient. A key prerequisite for enhancing robot learning beyond the level of low-level skill acquisition is situated multi-modal communication. Currently, most existing robot platforms still have to advance to make the development of an integrated learning architecture feasible. We report on the status of the Bielefeld GRAVIS-robot architecture that combines statistical methods, neural networks, and finite state machines into an integrated system for instructing grasping tasks by human-machine interaction. It combines visual attention and gestural instruction with an intelligent interface for speech recognition and linguistic interpretation and a modality fusion module to allow multi-modal task-oriented communication. It further integrates imitation of human hand postures to allow flexible grasping of everyday objects. With respect to this platform, we sketch the concept of a learning architecture based on several interlocking levels with the goal of demonstrating speech-supported imitation learning of grasping.
Using Speech in Visual Object Recognition
Mustererkennung 2000, 22. DAGM-Symposium Kiel, Informatik Aktuell, 2000
Cited by 2 (0 self)
Automatic understanding of multi-modal input is the central topic in modern human-computer interfaces. But the basic question of how the interpretations provided by different modalities can be connected in a universal and robust manner remains an open problem.

