Fast Keypoint Recognition using Random Ferns. PAMI, 2009. Accepted for Publication
Cited by 22 (5 self)
While feature point recognition is a key component of modern approaches to object detection, existing approaches require computationally expensive patch preprocessing to handle perspective distortion. In this paper, we show that formulating the problem in a Naive Bayesian classification framework makes such preprocessing unnecessary and produces an algorithm that is simple, efficient, and robust. Furthermore, it scales well as the number of classes grows. To recognize the patches surrounding keypoints, our classifier uses hundreds of simple binary features and models class posterior probabilities. We make the problem computationally tractable by assuming independence between arbitrary sets of features. Even though this is not strictly true, we demonstrate that our classifier nevertheless performs remarkably well on image datasets containing very significant perspective changes.
Index Terms: Image processing and computer vision, object recognition, tracking, image registration, feature matching, naive Bayesian.
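The abstract above describes grouping binary features into sets that are treated as independent of one another. The following is a minimal sketch of that idea, not the authors' implementation: the class name `FernClassifier`, the uniform class prior, the add-one smoothing, and the way features are sliced into consecutive groups are all assumptions made for illustration.

```python
import math


def fern_index(bits):
    """Pack the binary outcomes of one feature group into an integer index."""
    index = 0
    for b in bits:
        index = (index << 1) | int(b)
    return index


class FernClassifier:
    """Hypothetical semi-naive Bayes classifier over groups of binary features."""

    def __init__(self, num_classes, num_ferns, fern_size):
        self.num_classes = num_classes
        self.num_ferns = num_ferns
        self.fern_size = fern_size
        # counts[f][c][k]: how often group f produced index k for class c.
        self.counts = [
            [[0] * (1 << fern_size) for _ in range(num_classes)]
            for _ in range(num_ferns)
        ]

    def _indices(self, features):
        # Assumption: group f owns the f-th consecutive slice of the features.
        s = self.fern_size
        return [fern_index(features[f * s:(f + 1) * s])
                for f in range(self.num_ferns)]

    def train(self, features, label):
        for f, k in enumerate(self._indices(features)):
            self.counts[f][label][k] += 1

    def classify(self, features):
        indices = self._indices(features)
        best_class, best_score = None, -math.inf
        for c in range(self.num_classes):
            score = 0.0
            for f, k in enumerate(indices):
                row = self.counts[f][c]
                # Add-one smoothing (an assumption here) avoids log(0).
                score += math.log((row[k] + 1) / (sum(row) + len(row)))
            # Groups are treated as independent, so log-probabilities add up.
            if score > best_score:
                best_class, best_score = c, score
        return best_class
```

Within a group the joint distribution of the bits is kept in a lookup table, and only the groups are multiplied together, which is what keeps the number of parameters manageable in this sketch.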
Multiple Target Localisation at over 100 FPS
, 2009
Cited by 10 (1 self)
This paper presents a method for fast feature-based matching which enables 7 independent targets to be localised in a video sequence with an average total processing time of 7.46ms per frame. We extend recent work [14] on fast matching using Histogrammed Intensity Patches (HIPs) by adding a rotation invariant framework and a tree-based lookup scheme. Compared to state-of-the-art fast localisation schemes [15] we achieve better matching robustness in under a quarter of the computation time while requiring 5-10 times less memory.
SURFTrac: Efficient Tracking and Continuous Object Recognition using Local Feature Descriptors
- In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR09)
, 2009
Cited by 9 (1 self)
We present an efficient algorithm for continuous image recognition and feature descriptor tracking in video which operates by reducing the search space of possible interest points inside of the scale space image pyramid. Instead of performing tracking in 2D images, we search and match candidate features in local neighborhoods inside the 3D image pyramid without computing their feature descriptors. The candidates are further validated by fitting to a motion model. The resulting tracked interest points are more repeatable and resilient to noise, and descriptor computation becomes much more efficient because only those areas of the image pyramid that contain features are searched. We demonstrate our method on real-time object recognition and label augmentation running on a mobile device.
Compact signatures for high-speed interest point description and matching
- In International Conference on Computer Vision
, 2009
Cited by 5 (2 self)
Prominent feature point descriptors such as SIFT and SURF allow reliable real-time matching but at a computational cost that limits the number of points that can be handled on PCs, and even more on less powerful mobile devices. A recently proposed technique that relies on statistical classification to compute signatures has the potential to be much faster but at the cost of using very large amounts of memory, which makes it impractical for implementation on low-memory devices. In this paper, we show that we can exploit the sparseness of these signatures to compact them, speed up the computation, and drastically reduce memory usage. We base our approach on Compressive Sensing theory. We also highlight its effectiveness by incorporating it into two very different SLAM packages and demonstrating substantial performance increases.
Robust feature matching in 2.3µs
- In IEEE CVPR Workshop on Feature Detectors and Descriptors: The State Of The Art and Beyond
, 2009
Cited by 4 (1 self)
In this paper we present a robust feature matching scheme in which features can be matched in 2.3µs. For a typical task involving 150 features per image, this results in a processing time of 500µs for feature extraction and matching. In order to achieve very fast matching we use simple features based on histograms of pixel intensities and an indexing scheme based on their joint distribution. The features are stored with a novel bit mask representation which requires only 44 bytes of memory per feature and allows computation of a dissimilarity score in 20ns. A training phase gives the patch-based features invariance to small viewpoint variations. Larger viewpoint variations are handled by training entirely independent sets of features from different viewpoints. A complete system is presented where a database of around 13,000 features is used to robustly localise a single planar target in just over a millisecond, including all steps from feature detection to model fitting. The resulting system shows comparable robustness to SIFT [8] and Ferns [14] while using a tiny fraction of the processing time, and in the latter case a fraction of the memory as well.
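The abstract above mentions a bit mask representation and a fast dissimilarity score but does not spell out the encoding. The sketch below shows one plausible way such a score could be computed with a bitwise AND and a bit count. Everything in it is an assumption for illustration: the one-hot encoding per sample position, the rule that a model bit is set where an intensity bin was never observed in training, and the function names. It does not reproduce the 44-byte layout from the paper.

```python
def encode_patch(bins, num_bins):
    """One-hot encode each sample's quantised intensity bin into one integer."""
    mask = 0
    for position, b in enumerate(bins):
        mask |= 1 << (position * num_bins + b)
    return mask


def train_model(patches, num_bins):
    """Build a model mask with a 1 wherever a (position, bin) pair was never seen.

    Assumption: 'never seen in any training patch' stands in for whatever
    rarity criterion a real system would use.
    """
    seen = 0
    for bins in patches:
        seen |= encode_patch(bins, num_bins)
    total_bits = len(patches[0]) * num_bins
    return ~seen & ((1 << total_bits) - 1)


def dissimilarity(model_mask, query_bins, num_bins):
    """Count sample positions whose bin the model never observed."""
    query = encode_patch(query_bins, num_bins)
    return bin(model_mask & query).count("1")
```

For example, training on the hypothetical patches `[0, 1, 2]` and `[0, 1, 3]` with four bins gives a score of 0 for either training patch and a score of 2 for the query `[3, 1, 0]`, since two of its three positions fall in bins the model never saw.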
MOPED: A Scalable and low Latency Object Recognition and Pose Estimation System
- in ICRA
, 2010
Cited by 4 (2 self)
The latency of a perception system is crucial for a robot performing interactive tasks in dynamic human environments. We present MOPED, a fast and scalable perception system for object recognition and pose estimation. MOPED builds on POSESEQ, a state-of-the-art object recognition algorithm, demonstrating a massive improvement in scalability and latency without sacrificing robustness. We achieve this with both algorithmic and architecture improvements, with a novel feature matching algorithm, a hybrid GPU/CPU architecture that exploits parallelism at all levels, and an optimized resource scheduler. Using the same standard hardware, we achieve up to 30x improvement on real-world scenes.
The City of Sights: Design, Construction, and Measurement of an Augmented Reality Stage Set
Cited by 3 (2 self)
... views of the City of Sights, showing a virtual and a real representation of the total assembly.
We describe the design and implementation of a physical and virtual model of an imaginary urban scene, the “City of Sights”, that can serve as a backdrop or “stage” for a variety of Augmented Reality (AR) research. We argue that the AR research community would benefit from such a standard model dataset which can be used for evaluation of such AR topics as tracking systems, modeling, spatial AR, rendering tests, collaborative AR and user interface design. By openly sharing the digital blueprints and assembly instructions for our models, we allow the proposed set to be physically replicable by anyone and permit customization and experimental changes to the stage design which enable comprehensive exploration of algorithms and methods. Furthermore, we provide an accompanying rich dataset consisting of video sequences under varying conditions with ground truth camera pose. We employed three different ground truth acquisition methods to support a broad range of use cases. The goal of our design is to enable and improve the replicability and evaluation of future augmented reality research.
Toward Augmenting Everything: Detecting and Tracking Geometrical Features on Planar Objects
- In IEEE Int. Symp. on Mixed and Augmented Reality, ISMAR’11
, 2011
Cited by 2 (1 self)
This paper presents an approach for detecting and tracking various types of planar objects with geometrical features. We combine traditional keypoint detectors with Locally Likely Arrangement Hashing (LLAH) [21] for geometrical feature based keypoint matching. Because the stability of keypoint extraction affects the accuracy of the keypoint matching, we base the keypoint selection criteria on keypoint response and the distance between keypoints. To achieve robustness to scale changes, we build a non-uniform image pyramid according to keypoint distribution at each scale. In the experiments, we evaluate the applicability of traditional keypoint detectors with LLAH for the detection. We also compare our approach with SURF and finally demonstrate that it is possible to detect and track different types of textures including colorful pictures, binary fiducial markers and handwriting.
DISTRIBUTED MOBILE COMPUTER VISION AND APPLICATIONS ON THE ANDROID PLATFORM
Cited by 1 (0 self)
This thesis describes the theory and implementation of both local and distributed systems for object recognition on the mobile Android platform. It further describes the possibilities and limitations of computer vision applications on modern mobile devices. Depending on the application, some or all of the computations may be outsourced to a server to improve performance. The object recognition methods used are based on local features. These features are extracted and matched against a known set of features on the mobile device or on the server, depending on the implementation. In the thesis we describe local features using the popular SIFT and SURF algorithms. The matching is done using both simple exhaustive search and more advanced algorithms such as kd-tree best-bin-first search. To reduce false positives among the matches we have used different RANSAC-type iterative methods. We describe two implementations of applications for single- and multi-object recognition, and a third, heavily optimized, SURF implementation to achieve near
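The abstract above mentions RANSAC-type methods for rejecting false matches. As a rough illustration of the general idea only, the sketch below fits the simplest possible motion model, a pure 2D translation, to a set of point matches. The translation model, the threshold, the iteration count, and the function name `ransac_translation` are assumptions chosen to keep the example short; the thesis does not state which models or parameters it uses.

```python
import random


def ransac_translation(matches, threshold=2.0, iterations=100, seed=0):
    """Estimate a 2D translation from point matches while ignoring outliers.

    matches: list of ((x, y), (u, v)) pairs, source point and matched point.
    Returns the best (dx, dy) hypothesis and the matches that agree with it.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch repeatable
    best_model, best_inliers = None, []
    for _ in range(iterations):
        # A translation is fully determined by a single match.
        (x, y), (u, v) = rng.choice(matches)
        dx, dy = u - x, v - y
        # Keep every match whose displacement agrees with this hypothesis.
        inliers = [
            m for m in matches
            if abs(m[1][0] - m[0][0] - dx) <= threshold
            and abs(m[1][1] - m[0][1] - dy) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (dx, dy), inliers
    return best_model, best_inliers
```

A real object recognition pipeline would more likely fit a homography or a full pose, which needs several matches per hypothesis, but the sample, score, and keep-the-best loop has the same shape.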
Spatially Aware Handhelds for High-Precision Tangible Interaction with Large Displays
Cited by 1 (1 self)
While touch-screen displays are becoming increasingly popular, many factors affect user experience and performance. Surface quality, parallax, input resolution, and robustness, for instance, can vary with sensing technology, hardware configurations, and environmental conditions. We have developed a framework for exploring how we could overcome some of these dependencies, by leveraging the higher visual and input resolution of small, coarsely tracked mobile devices for direct, precise, and rapid interaction on large digital displays. The results from a formal user study show no significant differences in performance when comparing four techniques we developed for a tracked mobile device, where two existing touch-screen techniques served as baselines. The mobile techniques, however, had more consistent performance and smaller variations among participants, and an overall higher user preference in our setup. Our results show the potential of spatially aware handhelds as an interesting complement or substitute for direct touch-interaction on large displays.

