Results 1 -
4 of
4
Learning grasp strategies with partial shape information
- in AAAI
, 2008
"... We consider the problem of grasping novel objects in cluttered environments. If a full 3-d model of the scene were available, one could use the model to estimate the stability and robustness of different grasps (formalized as form/force-closure, etc); in practice, however, a robot facing a novel obj ..."
Abstract
-
Cited by 18 (7 self)
- Add to MetaCart
We consider the problem of grasping novel objects in cluttered environments. If a full 3-d model of the scene were available, one could use the model to estimate the stability and robustness of different grasps (formalized as form/force-closure, etc); in practice, however, a robot facing a novel object will usually be able to perceive only the front (visible) faces of the object. In this paper, we propose an approach to grasping that estimates the stability of different grasps, given only noisy estimates of the shape of visible portions of an object, such as that obtained from a depth sensor. By combining this with a kinematic description of a robot arm and hand, our algorithm is able to compute a specific positioning of the robot’s fingers so as to grasp an object. We test our algorithm on two robots (with very different arms/manipulators, including one with a multi-fingered hand). We report results on the task of grasping objects of significantly different shapes and appearances than ones in the training set, both in highly cluttered and in uncluttered environments. We also apply our algorithm to the problem of unloading items from a dishwasher.
Feccm for scene understanding: Helping the robot to learn multiple tasks
- In Video contribution in ICRA
, 2011
"... Abstract — Helping a robot to understand a scene can include many sub-tasks, such as scene categorization, object detection, geometric labeling, etc. Each sub-task is notoriously hard, and state-of-art classifiers exist for many sub-tasks. It is desirable to have an algorithm that can capture such c ..."
Abstract
-
Cited by 3 (3 self)
- Add to MetaCart
Abstract — Helping a robot to understand a scene can include many sub-tasks, such as scene categorization, object detection, geometric labeling, etc. Each sub-task is notoriously hard, and state-of-art classifiers exist for many sub-tasks. It is desirable to have an algorithm that can capture such correlation without requiring to make any changes to the inner workings of any classifier, and therefore make the perception for a robot better. We have recently proposed a generic model (Feedback Enabled Cascaded Classification Model) that enables us to easily take state-of-art classifiers as black-boxes and improve performance. In this video, we show that we can use our FeCCM model to quickly combine existing classifiers for various sub-tasks, and build a shoe finder robot in a day. The video shows our robot using FeCCM to find a shoe on request. I.
Human activity detection from RGBD images
- In AAAI workshop on Pattern, Activity and Intent Recognition (PAIR
, 2011
"... Being able to detect and recognize human activities is important for making personal assistant robots useful in performing assistive tasks. The challenge is to develop a system that is low-cost, reliable in unstructured home settings, and also straightforward to use. In this paper, we use a RGBD sen ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
Being able to detect and recognize human activities is important for making personal assistant robots useful in performing assistive tasks. The challenge is to develop a system that is low-cost, reliable in unstructured home settings, and also straightforward to use. In this paper, we use a RGBD sensor (Microsoft Kinect) as the input sensor, and present learning algorithms to infer the activities. Our algorithm is based on a hierarchical maximum entropy Markov model (MEMM). It considers a person’s activity as composed of a set of sub-activities, and infers the two-layered graph structure using a dynamic programming approach. We test our algorithm on detecting and recognizing twelve different activities performed by four people in different environments, such as a kitchen, a living room, an office, etc., and achieve an average performance of 84.3% when the person was seen before in the training set (and 64.2 % when the person was not seen before).
COMPUTATIONAL AND COGNITIVE VISION SYSTEMS: A TRAINING EUROPEAN NETWORK VISIONTRAIN
"... 2 Brief review of the state of the art 2 ..."

