Results 1 - 10 of 18
Enhanced computer vision with Microsoft Kinect sensor: A review
- IEEE Transactions on Cybernetics, 2013
Cited by 31 (2 self)
With the invention of the low-cost Microsoft Kinect sensor, high-resolution depth and visual (RGB) sensing has become available for widespread use. The complementary nature of the depth and visual information provided by the Kinect sensor opens up new opportunities to solve fundamental problems in computer vision. This paper presents a comprehensive review of recent Kinect-based computer vision algorithms and applications. The reviewed approaches are classified according to the type of vision problems that can be addressed or enhanced by means of the Kinect sensor. The covered topics include preprocessing, object tracking and recognition, human activity analysis, hand gesture analysis, and indoor 3-D mapping. For each category of methods, we outline their main algorithmic contributions and summarize their advantages/differences compared to their RGB counterparts. Finally, we give an overview of the challenges in this field and future research trends. This paper is expected to serve as a tutorial and source of references for Kinect-based computer vision researchers.
Tracking Revisited using RGBD Camera: Unified Benchmark and Baselines
Cited by 7 (1 self)
Despite significant progress, tracking is still considered to be a very challenging task. Recently, the increased popularity of depth sensors has made it possible to obtain reliable depth easily. This may be a game changer for tracking, since depth can be used to prevent model drift and handle occlusion. We also observe that current tracking algorithms are mostly evaluated on a very small number of videos collected and annotated by different groups. The lack of a reasonably sized and consistently constructed benchmark has prevented a persuasive comparison among different algorithms. In this paper, we construct a unified benchmark dataset of 100 RGBD videos with high diversity, propose different kinds of RGBD tracking algorithms using 2D or 3D models, and present a quantitative comparison of various algorithms with RGB or RGBD input. We aim to lay the foundation for further research in both RGB and RGBD tracking, and will make our dataset as well as an evaluation server available online.
Tracking people within groups with RGB-D data
- In Proc. of the International Conference on Intelligent Robots and Systems (IROS), 2012
Cited by 7 (4 self)
This paper proposes a very fast and robust multi-people tracking algorithm suitable for mobile platforms equipped with an RGB-D sensor. Our approach features a novel depth-based sub-clustering method explicitly designed for detecting people within groups or near the background, and a three-term joint likelihood for limiting drifts and ID switches. Moreover, an online-learned appearance classifier is proposed that robustly specializes on a track while using the other detections as negative examples. Tests have been performed with data acquired from a mobile robot in indoor environments and on a publicly available dataset acquired with three RGB-D sensors, and results have been evaluated with the CLEAR MOT metrics. Our method reaches near state-of-the-art performance and very high frame rates in our distributed ROS-based CPU implementation.
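The depth-based sub-clustering idea can be illustrated with a minimal 1-D sketch: people standing in a group often merge into one Euclidean cluster, yet a gap in their depth distribution still separates them. The snippet below splits a merged cluster at depth gaps; the gap threshold is an illustrative assumption, not the paper's method, which operates on the full point cloud:

```python
import numpy as np

def depth_subcluster(depths, gap=0.4):
    """Split one merged detection cluster into sub-clusters along depth.

    depths: per-point depth values in meters; a jump larger than `gap`
    between consecutive sorted depths starts a new sub-cluster.
    Returns an integer sub-cluster label per input point.
    """
    depths = np.asarray(depths, dtype=float)
    order = np.argsort(depths)
    d = depths[order]
    labels_sorted = np.zeros(len(d), dtype=int)
    # each depth jump wider than `gap` increments the cluster id
    labels_sorted[1:] = np.cumsum(np.diff(d) > gap)
    labels = np.empty(len(d), dtype=int)
    labels[order] = labels_sorted  # map back to the input ordering
    return labels
```

For example, points at depths [1.0, 1.1, 1.05, 2.5, 2.6] split into two sub-clusters, matching the intuition that two people standing front-to-back in a group occupy distinct depth bands.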
FollowMe: Person Following and Gesture Recognition with a Quadrocopter
Cited by 4 (0 self)
In this paper, we present an approach that allows a quadrocopter to follow a person and to recognize simple gestures using an onboard depth camera. This enables novel applications such as hands-free filming and picture taking. The problem of tracking a person with an onboard camera, however, is highly challenging due to the self-motion of the platform. To overcome this problem, we stabilize the depth image by warping it to a virtual static camera, using the estimated pose of the quadrocopter obtained from vision and inertial sensors fused with an Extended Kalman filter. We show that such a stabilized depth video is well suited for use with existing person trackers such as the OpenNI tracker. Using this approach, the quadrocopter obtains not only the position and orientation of the tracked person, but also the full body pose, which can then, for example, be used to recognize hand gestures to control the quadrocopter's behaviour. We implemented a small set of example commands ("follow me", "take picture", "land") and generated corresponding motion commands. We demonstrate the practical performance of our approach in an extensive set of experiments with a quadrocopter. Although our current system is limited to indoor environments and small motions due to the restrictions of the depth sensor used, it indicates that there is large potential for such applications in the near future.
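The stabilization step can be sketched as a standard depth-image warp: back-project each valid pixel with the pinhole model, transform the points by the estimated pose relative to the virtual camera, and re-project with a z-buffer. This is an illustrative forward-warping sketch under an assumed pinhole model, not the paper's implementation:

```python
import numpy as np

def warp_depth(depth, K, R, t):
    """Warp a depth image into a virtual static camera.

    depth: HxW array in meters (0 = invalid); K: 3x3 pinhole intrinsics;
    R, t: rotation and translation taking points from the moving camera
    frame into the virtual camera frame (assumed known, e.g. from an
    EKF pose estimate fusing vision and inertial data).
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    z = depth.ravel()
    valid = z > 0
    # back-project valid pixels to 3-D points in the moving camera frame
    x = (u.ravel() - K[0, 2]) * z / K[0, 0]
    y = (v.ravel() - K[1, 2]) * z / K[1, 1]
    pts = np.stack([x, y, z])[:, valid]
    # move points into the virtual camera frame and re-project
    pts = R @ pts + t.reshape(3, 1)
    zn = pts[2]
    un = np.round(K[0, 0] * pts[0] / zn + K[0, 2]).astype(int)
    vn = np.round(K[1, 1] * pts[1] / zn + K[1, 2]).astype(int)
    out = np.zeros_like(depth)
    keep = (un >= 0) & (un < W) & (vn >= 0) & (vn < H) & (zn > 0)
    # simple z-buffer: write far points first so near points win
    order = np.argsort(-zn[keep])
    out[vn[keep][order], un[keep][order]] = zn[keep][order]
    return out
```

With an identity pose the warp is (up to rounding) the identity, which gives a quick sanity check; with a real pose estimate, holes appear where the moving camera saw no surface and would need inpainting or masking downstream.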
SUN RGB-D: An RGB-D scene understanding benchmark suite
- In CVPR, 2015
Cited by 3 (1 self)
Although RGB-D sensors have enabled major breakthroughs for several vision tasks, such as 3D reconstruction, we have not attained the same level of success in high-level scene understanding. Perhaps one of the main reasons is the lack of a large-scale benchmark with 3D annotations and 3D evaluation metrics. In this paper, we introduce an RGB-D benchmark suite with the goal of advancing the state of the art in all major scene understanding tasks. Our dataset is captured by four different sensors and contains 10,335 RGB-D images, at a similar scale to PASCAL VOC. The whole dataset is densely annotated and includes 146,617 2D polygons and 64,595 3D bounding boxes with accurate object orientations, as well as a 3D room layout and scene category for each image. This dataset enables us to train data-hungry algorithms for scene-understanding tasks, evaluate them using meaningful 3D metrics, avoid overfitting to a small testing set, and study cross-sensor bias.
Real-Time Multiple Human Perception with Color-Depth Cameras on a Mobile Robot
Cited by 3 (3 self)
The ability to perceive humans is an essential requirement for safe and efficient human-robot interaction. In real-world applications, the need for a robot to interact in real time with multiple humans in a dynamic, 3-D environment presents a significant challenge. The recent availability of commercial color-depth cameras allows for the creation of a system that makes use of the depth dimension, thus enabling a robot to observe its environment and perceive in 3-D space. Here we present a system for 3-D multiple human perception in real time from a moving robot equipped with a color-depth camera and a consumer-grade computer. Our approach reduces computation time to achieve real-time performance through a unique combination of new ideas and established techniques. We remove the ground and ceiling planes from the 3-D point cloud input to separate candidate point clusters. We introduce a novel information concept, depth of interest, which we use to identify candidates for detection and which avoids the computationally expensive scanning-window methods of other approaches. We utilize a cascade of detectors to distinguish humans from objects, in which we make intelligent reuse of intermediary features in successive detectors to reduce computation. Because of the high computational cost of some methods, we represent our candidate tracking algorithm with a decision directed acyclic graph, which allows us to use the most computationally intense techniques only where necessary. We detail the successful implementation of our novel approach on a mobile robot and examine its performance in scenarios with real-world challenges, including occlusion, robot motion, non-upright humans, humans leaving and reentering the field of view (i.e., the re-identification challenge), and human-object and human-human interaction. We conclude that by incorporating depth information, together with the use of modern techniques in new ways, we are able to create an accurate system for real-time 3-D perception of humans by a mobile robot.
Index Terms—3-D vision, depth of interest, human detection and tracking, human perception, RGB-D camera application.
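The ground- and ceiling-removal step described in the abstract can be sketched as a simple height filter on a gravity-aligned point cloud. The thresholds and the assumption of an already-aligned frame are illustrative; the paper's system works from the raw sensor cloud:

```python
import numpy as np

def remove_ground_ceiling(points, ground_z=0.05, ceiling_z=2.4):
    """Drop points near the ground and ceiling planes.

    points: Nx3 array (meters) with the z axis pointing up, assumed
    already expressed in a gravity-aligned frame. Removing these two
    planes disconnects candidate clusters (people, furniture) from one
    another so they can be segmented and classified independently.
    """
    pts = np.asarray(points, dtype=float)
    keep = (pts[:, 2] > ground_z) & (pts[:, 2] < ceiling_z)
    return pts[keep]
```

After this filter, a Euclidean clustering pass yields the candidate clusters that the detector cascade then classifies as human or non-human.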
Scene in the Loop: Towards Adaptation-by-Tracking in RGB-D Data
Cited by 1 (0 self)
This paper addresses the problem of adapting an existing object detector to the characteristics of the environment in an unsupervised manner. The technique aims to reject all the false positive detections by exploiting information from the environment and from the tracking system. We follow the intuition that similar characteristics are shared among the objects that are present in the same scene. Our aim is to detect the false positives by analyzing which detections do not share common properties in RGB-D feature space. For this, we make use of a One-class SVM in an unsupervised manner. This idea allows our approach to adapt to the environment it is tracking in. We developed and evaluated our system based on a people detection and tracking system that operates on Kinect data. Our experimental evaluation shows that our method outperforms standard outlier detection techniques and that it is able to remove over 50% of the false positives without eliminating a significant number of correct detections.
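The core idea, that detections failing to share the scene's common feature statistics are likely false positives, can be sketched with a simple unsupervised score. Note this uses a Mahalanobis distance as a numpy-only stand-in for the One-class SVM the paper actually employs:

```python
import numpy as np

def outlier_scores(features):
    """Score each detection by its Mahalanobis distance from the mean
    of all detections in the scene.

    features: N x D array of per-detection RGB-D feature vectors.
    High scores mark detections that do not share the scene's common
    properties, i.e. candidate false positives. A stand-in for the
    paper's One-class SVM, not its actual formulation.
    """
    X = np.asarray(features, dtype=float)
    mu = X.mean(axis=0)
    # small ridge keeps the covariance invertible for tiny samples
    cov = np.cov(X.T) + 1e-6 * np.eye(X.shape[1])
    inv = np.linalg.inv(cov)
    d = X - mu
    return np.sqrt(np.einsum('ij,jk,ik->i', d, inv, d))
```

Thresholding these scores (or replacing them with a One-class SVM decision function) then prunes detections before they can corrupt the tracker.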
OpenPTrack: People Tracking for Heterogeneous Networks of Color-Depth Cameras
Cited by 1 (1 self)
This paper introduces OpenPTrack, open source software for multi-camera people tracking in RGB-D camera networks. OpenPTrack provides real-time people detection and tracking algorithms from 3D data coming from Microsoft Kinect and Mesa SwissRanger sensors. The software is able to track people at 30 Hz with minimal latency. A user-friendly calibration procedure is also provided, so that the camera network can be calibrated in a few seconds by moving a checkerboard in front of the cameras and seeing the calibration results in real time. The algorithms for people detection are executed in a distributed fashion for every sensor, while tracking is done by a single node which takes into account detections from all over the network. Algorithms based on RGB or depth are automatically enabled while the system is running, depending on the luminance properties of the image. OpenPTrack is based on the Robot Operating System and the Point Cloud Library and has been tested on networks composed of up to six sensors.
The role of RGB-D benchmark datasets: an overview
- arXiv preprint arXiv:1310.2053, 2013
Cited by 1 (0 self)
The advent of the Microsoft Kinect three years ago not only stimulated the computer vision community to develop new algorithms and setups addressing well-known problems, but also sparked the launch of several new benchmark datasets to which future algorithms can be compared. This review of the literature and industry developments concludes that the current RGB-D benchmark datasets can be useful for determining the accuracy of a variety of applications of a single or multiple RGB-D sensors.
A Multimodal Person-following System for Telepresence Applications
- Wee Ching Pang
This paper presents the design and implementation of a multimodal person-following system for a mobile telepresence robot. A color histogram matching and position matching algorithm was developed for a person-recognition function using Kinect sensors. Robot motion was controlled by adjusting its velocity according to the human's position in relation to the robot. The robot was able to follow the targeted person in various person-following modes, such as the back-following mode, the side-by-side accompaniment mode, as well as the front-guiding mode. An obstacle avoidance function was also implemented using the virtual potential field algorithm.
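The color-histogram matching component can be sketched as a normalized hue histogram per candidate, compared by histogram intersection. The bin count and the use of hue alone are illustrative assumptions, not the paper's exact descriptor:

```python
import numpy as np

def hist_descriptor(hue_values, n_bins=16):
    """Normalized hue histogram over the pixels of a person candidate.

    hue_values: 1-D array of hue values normalized to [0, 1];
    n_bins is an illustrative choice, not the paper's setting.
    """
    h, _ = np.histogram(hue_values, bins=n_bins, range=(0.0, 1.0))
    return h / max(h.sum(), 1)

def hist_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical color distributions."""
    return float(np.minimum(h1, h2).sum())
```

A tracked person whose stored descriptor scores highest (and above a threshold) against a new detection's descriptor is re-identified as the target; combining this with the position-matching term makes the match robust to similarly dressed bystanders.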