COMPARISON OF LOCAL FEATURE DESCRIPTORS FOR MOBILE VISUAL SEARCH
Cited by 5 (3 self)
We evaluate the performance of MPEG-7 image signatures, Compressed Histogram of Gradients descriptor (CHoG) and Scale Invariant Feature Transform (SIFT) descriptors for mobile visual search applications. We observe that SIFT and CHoG outperform MPEG-7 image signatures greatly in terms of feature-level Receiver Operating Characteristic (ROC) performance and image-level matching. Moreover, CHoG descriptors demonstrate such gains while being comparable with MPEG-7 image signatures in bit-rate.
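A feature-level ROC comparison of this kind can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `roc_points`, the distance values, and the thresholds are all assumptions made for the example. It assumes that each descriptor pair has a distance and a ground-truth label (matching or non-matching), and that a pair is declared a match when its distance falls at or below a threshold.

```python
def roc_points(match_dists, nonmatch_dists, thresholds):
    """Return one (false_positive_rate, true_positive_rate) point per threshold.

    match_dists:    descriptor distances of ground-truth matching pairs
    nonmatch_dists: descriptor distances of ground-truth non-matching pairs
    """
    points = []
    for t in thresholds:
        # A pair is accepted as a match when its distance is at most t.
        tpr = sum(d <= t for d in match_dists) / len(match_dists)
        fpr = sum(d <= t for d in nonmatch_dists) / len(nonmatch_dists)
        points.append((fpr, tpr))
    return points


# Hypothetical distances, chosen only to make the example easy to follow.
match_dists = [0.1, 0.2, 0.4, 0.9]
nonmatch_dists = [0.5, 0.7, 0.8, 1.0]
curve = roc_points(match_dists, nonmatch_dists, [0.0, 0.45, 0.85, 1.0])
```

A descriptor whose curve lies closer to the top-left corner separates matching from non-matching pairs better at the feature level.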
Toward Augmenting Everything: Detecting and Tracking Geometrical Features on Planar Objects
- in IEEE Int. Symp. on Mixed and Augmented Reality (ISMAR’11), 2011
Cited by 2 (1 self)
This paper presents an approach for detecting and tracking various types of planar objects with geometrical features. We combine traditional keypoint detectors with Locally Likely Arrangement Hashing (LLAH) [21] for geometrical-feature-based keypoint matching. Because the stability of keypoint extraction affects the accuracy of the keypoint matching, we set the criteria of keypoint selection on keypoint response and the distance between keypoints. To achieve robustness to scale changes, we build a non-uniform image pyramid according to keypoint distribution at each scale. In the experiments, we evaluate the applicability of traditional keypoint detectors with LLAH for the detection. We also compare our approach with SURF and finally demonstrate that it is possible to detect and track different types of textures including colorful pictures, binary fiducial markers and handwriting.
Rapid image retrieval for mobile location recognition
- in Proc. IEEE Conf. Acoustics, Speech and Signal Processing, 2011
Cited by 2 (1 self)
Recognizing the location and orientation of a mobile device from captured images is a promising application of image retrieval algorithms. Matching the query images to an existing georeferenced database like Google Street View enables mobile search for location related media, products, and services. Due to the rapidly changing field of view of the mobile device caused by constantly changing user attention, very low retrieval times are essential. These can be significantly reduced by performing the feature quantization on the handheld and transferring compressed Bag-of-Feature vectors to the server. To cope with the limited processing capabilities of handhelds, the quantization of high dimensional feature descriptors has to be performed at very low complexity. To this end, we introduce in this paper the novel Multiple Hypothesis Vocabulary Tree (MHVT) as a step towards real-time mobile location recognition. The MHVT increases the probability of assigning matching feature descriptors to the same visual word by introducing an overlapping buffer around the separating hyperplanes to allow for a soft quantization and an adaptive clustering approach. Further, a novel framework is introduced that allows us to integrate the probability of correct quantization in the distance calculation using an inverted file scheme. Our experiments demonstrate that our approach achieves query times reduced by up to a factor of 10 when compared to the state-of-the-art.
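The idea of an overlapping buffer around separating hyperplanes can be sketched as follows. This is a minimal illustration under stated assumptions, not the MHVT as published: the tree is assumed to be binary, each hyperplane is assumed to lie midway between the two child centers, and the names `signed_distance`, `soft_quantize`, and `buffer_width` are hypothetical. The abstract does not say whether the overlap is applied to database descriptors, query descriptors, or both, so the sketch simply returns every visual word a descriptor can reach.

```python
import math


def signed_distance(x, c_left, c_right):
    """Signed distance from x to the hyperplane midway between two centers.

    Negative values lie on the c_left side, positive values on the c_right side.
    """
    normal = [r - l for l, r in zip(c_left, c_right)]
    mid = [(l + r) / 2 for l, r in zip(c_left, c_right)]
    norm = math.sqrt(sum(n * n for n in normal))
    return sum((xi - mi) * ni for xi, mi, ni in zip(x, mid, normal)) / norm


def soft_quantize(x, node, buffer_width):
    """Return the set of visual words (leaf ids) reachable by descriptor x.

    A descriptor within buffer_width of a separating hyperplane descends
    into both children (assumed form of the soft quantization).
    """
    if "word" in node:
        return {node["word"]}
    left, right = node["children"]
    d = signed_distance(x, left["center"], right["center"])
    words = set()
    if d <= buffer_width:   # on the left side, or inside the buffer
        words |= soft_quantize(x, left, buffer_width)
    if d >= -buffer_width:  # on the right side, or inside the buffer
        words |= soft_quantize(x, right, buffer_width)
    return words


# Hypothetical one-level tree: the separating hyperplane is the line x = 2.
tree = {"children": [{"center": [0.0, 0.0], "word": 0},
                     {"center": [4.0, 0.0], "word": 1}]}

far_left = soft_quantize([0.5, 0.0], tree, buffer_width=0.5)
in_buffer = soft_quantize([2.3, 1.0], tree, buffer_width=0.5)
far_right = soft_quantize([3.5, 0.0], tree, buffer_width=0.5)
```

Only the descriptor near the hyperplane receives both words; descriptors well inside a cell are quantized exactly as in a hard assignment, so the extra cost is confined to ambiguous cases.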
Low-Cost Asset Tracking using Location-Aware Camera Phones
Maintaining an accurate and up-to-date inventory of one’s assets is a labor-intensive, tedious, and costly operation. To ease this difficult but important task, we design and implement a mobile asset tracking system for automatically generating an inventory by snapping photos of the assets with a smartphone. Since smartphones are becoming ubiquitous, construction and deployment of our inventory management solution is simple and cost-effective. Automatic asset recognition is achieved by first segmenting individual assets out of the query photo and then performing bag-of-visual-features (BoVF) image matching on the segmented regions. The smartphone’s sensor readings, such as digital compass and accelerometer measurements, can be used to determine the location of each asset, and this location information is stored in the inventory for each recognized asset. As a special case study, we demonstrate a mobile book tracking system, where users snap photos of books stacked on bookshelves to generate a location-aware book inventory. It is shown that segmenting the book spines is very important for accurate feature-based image matching into a database of book spines. Segmentation also provides the exact orientation of each book spine, so more discriminative upright local features can be employed for improved recognition. This system’s mobile client has been implemented for smartphones running the Symbian or Android operating systems. The client enables a user to snap a picture of a bookshelf and to
FAST GEOMETRIC RE-RANKING FOR IMAGE-BASED RETRIEVAL
We present a fast and efficient geometric re-ranking method that can be incorporated into a feature-based image retrieval system that utilizes a Vocabulary Tree (VT). We form feature pairs by comparing descriptor classification paths in the VT and calculate a geometric similarity score for these pairs. We propose a location geometric similarity scoring method that is invariant to rotation, scale, and translation, and can be easily incorporated into mobile visual search and augmented reality systems. We compare the performance of the location geometric scoring scheme to orientation and scale geometric scoring schemes. We show in our experiments that re-ranking schemes can substantially improve recognition accuracy. We can also reduce the worst case server latency up to 1 sec and still improve the recognition performance. Index Terms: image-based retrieval, mobile visual search, robust features, geometric verification
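One way to obtain a location-based score that is invariant to rotation, scale, and translation is sketched below. The abstract does not give the scoring formula, so this is an assumption rather than the authors' method: for every two feature matches, the sketch compares the distance between the two features in the query image with the distance between their counterparts in the database image, and counts how many such ratios agree. The function name `location_geometric_score` and the `bin_width` parameter are hypothetical.

```python
import math
from itertools import combinations


def location_geometric_score(query_pts, db_pts, bin_width=0.2):
    """Score a candidate image from matched feature locations.

    query_pts[i] and db_pts[i] are the (x, y) locations of the i-th match.
    Pairwise distances are unaffected by rotation and translation, and a
    uniform scale change multiplies every distance ratio by the same factor,
    so consistent matches vote into the same log-ratio bin.
    """
    votes = {}
    for i, j in combinations(range(len(query_pts)), 2):
        dq = math.dist(query_pts[i], query_pts[j])
        dd = math.dist(db_pts[i], db_pts[j])
        if dq == 0 or dd == 0:
            continue  # coincident points carry no distance information
        b = math.floor(math.log(dq / dd) / bin_width)
        votes[b] = votes.get(b, 0) + 1
    # The score is the size of the largest group of agreeing pairs.
    return max(votes.values(), default=0)


# Hypothetical data: the query is the database image rotated by 90 degrees,
# scaled by 2, and shifted, so all six pairwise ratios agree.
db_pts = [(0, 0), (1, 0), (0, 1), (2, 2)]
query_pts = [(5, -3), (5, -1), (3, -3), (1, 1)]
consistent = location_geometric_score(query_pts, db_pts)

# Moving one query point breaks the agreement for every pair that uses it.
outlier_pts = [(5, -3), (5, -1), (3, -3), (40, 7)]
inconsistent = location_geometric_score(outlier_pts, db_pts)
```

Candidates from the initial shortlist can then be re-ranked by this score, with the consistent case above scoring higher than the case containing the outlier.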
Low-latency and robust visual localization
- by Mohammad Abu-Alqumsan, Anas Al-Nuaimi, and Eckehard Steinbach
Digital Object Identifier 10.1109/MSP.2011.940882. Date of publication: 15 June 2011.
Information about the location, orientation, and context of a mobile device is of central importance for future multimedia applications and location-based services (LBSs). With the widespread adoption of modern camera phones, including powerful processors, inertial measurement units, compass, and assisted global positioning system (GPS) receivers, the variety of location- and context-based services has significantly increased over the last years. These include, for instance, the search for points of interest in the vicinity, geotagging and retrieval of user generated media, targeted advertising, navigation systems, social applications such as Foursquare [1], and many more. While satellite navigation systems can provide sufficient positioning accuracy, a clear view of at least four satellites is required, limiting their applicability to outdoor scenarios with few obstacles. Unfortunately, most interesting LBSs could be provided in densely populated environments, which include urban canyons and indoor scenarios. Figure 1 shows the GPS recordings (black line) of an iPhone 4 while driving a car through downtown San Francisco. Although a state-of-the-art assisted GPS Broadcom chip is used, the phone mounting ensures the best signal reception, and a motion model is applied to filter out large deviations, the localization error is in the range of 50–100 m. This is caused by multipath effects, which are even more severe if the user is traveling on the sidewalks and not in the middle of the street. Here, an initial positioning
Leveraging 3D City Models for Rotation Invariant Place-of-Interest Recognition
- in Int J Comput Vis, DOI 10.1007/s11263-011-0458-7, 2010
Given a cell phone image of a building, we address the problem of place-of-interest recognition in urban scenarios. Here, we go beyond what has been shown in earlier approaches by exploiting 3D building information, which is now often available (e.g. from extruded floor plans), and massive street-level image data for database creation. Exploiting vanishing points in query images and thus fully removing 3D rotation from the recognition problem then allows us to simplify the feature invariance to a purely homothetic problem, which we show enables more discriminative power in feature descriptors than classical SIFT. We rerank visual word based document queries using a fast stratified homothetic verification that in most cases boosts the correct document to top positions if it was in the short list. Since we exploit 3D building information, the approach finally outputs the camera pose in real world coordinates ready for augmenting the cell phone image with virtual 3D information. The whole system is demonstrated to outperform traditional approaches on city scale experiments for different
MOBILE AUGMENTED REALITY FOR BOOKS ON A SHELF
Retrieving information about books on a bookshelf by snapping a photo of book spines with a mobile device is very useful for bookstores, libraries, offices, and homes. In this paper, we develop a new mobile augmented reality system for book spine recognition. Our system achieves very low recognition delays, around 1 second, to support real-time augmentation on a mobile device’s viewfinder. We infer user interest by analyzing the motion of objects seen in the viewfinder. Our system initiates a query during each low-motion interval. This selection mechanism eliminates the need to press a button and avoids using degraded motion-blurred query frames during high-motion intervals. The viewfinder is augmented with a book’s identity, prices from different vendors, average user rating, location within the enclosing bookshelf, and a digital compass marker. We present a new tiled search strategy for finding the location in the bookshelf with improved accuracy in half the time of a previous state-of-the-art system. Our AR system has been implemented on an Android smartphone.
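The query selection described above, one query per low-motion interval, can be sketched as follows. This is a minimal illustration under assumptions: the abstract does not state how motion is measured or how a low-motion interval is detected, so the per-frame motion values, the `threshold`, and the `min_still_frames` parameter are all hypothetical.

```python
def query_frames(motion, threshold, min_still_frames):
    """Return the frame indices at which a query would be initiated.

    motion:           one motion magnitude per viewfinder frame (assumed given)
    threshold:        motion below this value counts as low motion
    min_still_frames: consecutive low-motion frames required before querying
    """
    triggers = []
    still = 0
    for i, m in enumerate(motion):
        if m < threshold:
            still += 1
            # Fire exactly once, when the interval first becomes long enough.
            if still == min_still_frames:
                triggers.append(i)
        else:
            still = 0  # high motion ends the interval and re-arms the trigger
    return triggers


# Hypothetical motion trace with two low-motion intervals.
motion = [9, 8, 1, 1, 1, 1, 7, 2, 1, 0, 6]
triggers = query_frames(motion, threshold=3, min_still_frames=3)
```

Because the counter resets on any high-motion frame, frames captured while the device is moving quickly are never chosen as queries, and no button press is needed.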
ETH Zurich
With recent advances in mobile computing, the demand for visual localization or landmark identification on mobile devices is gaining interest. We advance the state of the art in this area by fusing two popular representations of street-level image data, facade-aligned and viewpoint-aligned, and show that they contain complementary information that can be exploited to significantly improve the recall rates on the city scale. We also improve feature detection in low contrast parts of the street-level data, and discuss how to incorporate priors on a user’s position (e.g. given by noisy GPS readings or network cells), which previous approaches often ignore. Finally, and maybe most importantly, we present our results according to a carefully designed, repeatable evaluation scheme and make publicly available a set of 1.7 million images with ground truth labels, geotags, and calibration data, as well as a difficult set of cell phone query images. We provide these resources as a benchmark to facilitate further research in the area.

