The Stanford Mobile Visual Search Data Set
Cited by 1 (0 self)
We survey popular data sets used in computer vision literature and point out their limitations for mobile visual search applications. To overcome many of the limitations, we propose the Stanford Mobile Visual Search data set. The data set contains camera-phone images of products, CDs, books, outdoor landmarks, business cards, text documents, museum paintings and video clips. The data set has several key characteristics lacking in existing data sets: rigid objects, widely varying lighting conditions, perspective distortion, foreground and background clutter, realistic ground-truth reference data, and query data collected from heterogeneous low and high-end camera phones. We hope that the data set will help push research forward in the field of mobile visual search.
Survey and evaluation of audio fingerprinting schemes for mobile query-by-example applications
- in ISMIR, 2011
Cited by 1 (0 self)
We survey and evaluate popular audio fingerprinting schemes in a common framework with short query probes captured from cell phones. We report and discuss results important for mobile applications: Receiver Operating Characteristic (ROC) performance, size of fingerprints generated compared to size of audio probe, and transmission delay if the fingerprint data were to be transmitted over a wireless link. We hope that the evaluation in this work will guide work towards reducing latency in practical mobile audio retrieval applications.
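Two of the quantities named in this abstract, the fingerprint-to-probe size ratio and the transmission delay over a wireless link, reduce to simple arithmetic. The following is a minimal sketch only: the function names, the byte counts, and the uplink rate are hypothetical and are not taken from the paper, and the delay model ignores protocol overhead and round-trip time.

```python
def size_ratio(fingerprint_bytes, probe_bytes):
    """Fingerprint size relative to the raw audio probe (smaller is better)."""
    return fingerprint_bytes / probe_bytes


def transmission_delay_s(payload_bytes, uplink_kbps):
    """Seconds needed to push the payload through an uplink of the given rate.

    Assumption: delay = payload bits / link rate, with no overhead modeled.
    """
    return payload_bytes * 8 / (uplink_kbps * 1000)


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(size_ratio(2_000, 80_000))
    print(transmission_delay_s(2_000, 80))
```

Under this simple model, sending a fingerprint instead of the probe shortens the delay by exactly the size ratio.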
Low-latency and robust visual localization
Mohammad Abu-Alqumsan, Anas Al-Nuaimi, and Eckehard Steinbach
Information about the location, orientation, and context of a mobile device is of central importance for future multimedia applications and location-based services (LBSs). With the widespread adoption of modern camera phones, including powerful processors, inertial measurement units, compass, and assisted global positioning system (GPS) receivers, the variety of location- and context-based services has significantly increased in recent years. These include, for instance, the search for points of interest in the vicinity, geotagging and retrieval of user-generated media, targeted advertising, navigation systems, social applications such as Foursquare [1], and many more.
Digital Object Identifier 10.1109/MSP.2011.940882. Date of publication: 15 June 2011.
While satellite navigation systems can provide sufficient positioning accuracy, a clear view of at least four satellites is required, limiting their applicability to outdoor scenarios with few obstacles. Unfortunately, most interesting LBSs could be provided in densely populated environments, which include urban canyons and indoor scenarios. Figure 1 shows the GPS recordings (black line) of an iPhone 4 while driving a car through downtown San Francisco. Although a state-of-the-art assisted GPS Broadcom chip is used, the phone mounting ensures the best signal reception, and a motion model is applied to filter out large deviations, the localization error is in the range of 50–100 m. This is caused by multipath effects, which are even more severe if the user is traveling on the sidewalks and not in the middle of the street. Here, an initial positioning
Towards Low Bit Rate Mobile Visual Search with Multiple-Channel Coding
In this paper, we propose a multiple-channel coding scheme to extract compact visual descriptors for low bit rate mobile visual search. Unlike previous visual search scenarios that send the query image, we make use of the ever-growing mobile computational capability to directly extract compact visual descriptors at the mobile end. Meanwhile, stepping forward from state-of-the-art compact descriptor extraction, we exploit the rich contextual cues at the mobile end (such as GPS tags for mobile visual search and 2D barcodes or RFID tags for mobile product search), together with the visual statistics at the reference database, to learn multiple coding channels. Therefore, we describe the query with one of many forms of high-dimensional visual signature, which is subsequently mapped to one or more channels and compressed. The compression function within each channel is learnt based on a novel robust PCA scheme, with specific consideration to preserve the retrieval ranking capability of the original signature. We have deployed our scheme on both iPhone4 and HTC DESIRE 7 to search ten million landmark images in a low bit rate setting. Quantitative comparisons to the state of the art demonstrate our significant advantages in descriptor compactness (with orders-of-magnitude improvement) and retrieval mAP in mobile landmark, product, and CD/book cover search.
Network Assisted Mobile Computing with Optimal Uplink Query Processing
Many mobile applications retrieve content from remote servers via user-generated queries. Processing these queries is often needed before the desired content can be identified. Processing the request on the mobile devices can quickly sap the limited battery resources. Conversely, processing user queries at remote servers can have slow response times due to communication latency incurred during transmission of the potentially large query. We evaluate a network-assisted mobile computing scenario where mid-network nodes with “leasing” capabilities are deployed by a service provider. Leasing computation power can reduce battery usage on the mobile devices and improve response times. However, borrowing processing power from mid-network nodes comes at a leasing cost which must be accounted for when making the decision of where processing should occur. We study the tradeoff between battery usage, processing and transmission latency, and mid-network leasing. We use the dynamic programming framework to solve for the optimal processing policies that suggest the amount of processing to be done at each mid-network node in order to minimize the processing and communication latency and processing costs. Through numerical studies, we examine the properties of the optimal processing policy and the core tradeoffs in such systems.
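The dynamic-programming idea in this abstract can be illustrated with a toy model. The sketch below is not the paper's formulation: it assumes a query that needs a fixed number of processing stages, a linear path of nodes (mobile device first, server last), a per-stage cost at each node that lumps together latency, battery, and leasing cost, and a query size that shrinks as stages complete. The names `optimal_policy`, `proc_cost`, `hop_latency`, and `size_after` are hypothetical.

```python
from functools import lru_cache


def optimal_policy(proc_cost, hop_latency, size_after):
    """Choose how many processing stages to run at each node along the path.

    proc_cost[n]   : assumed cost of running one stage at node n
    hop_latency[n] : assumed latency per unit of data on the link n -> n+1
    size_after[k]  : assumed query size once k stages are complete
    The last node must finish whatever stages remain.
    Returns (total_cost, stages_run_at_each_node).
    """
    n_nodes = len(proc_cost)
    total_stages = len(size_after) - 1

    @lru_cache(maxsize=None)
    def best(node, done):
        remaining = total_stages - done
        if node == n_nodes - 1:
            return remaining * proc_cost[node], (remaining,)
        options = []
        for k in range(remaining + 1):
            # Cost of running k stages here, then forwarding the shrunken query.
            here = k * proc_cost[node] + hop_latency[node] * size_after[done + k]
            later, plan = best(node + 1, done + k)
            options.append((here + later, (k,) + plan))
        return min(options)

    return best(0, 0)


if __name__ == "__main__":
    # Hypothetical three-node path: mobile, one mid-network node, server.
    cost, plan = optimal_policy(
        proc_cost=[5.0, 2.0, 1.0],
        hop_latency=[1.0, 1.0],
        size_after=[10.0, 4.0, 2.0],
    )
    print(cost, plan)
```

In this toy instance, processing is expensive on the mobile device but shrinks the query before the first hop, so the cheapest plan splits the work between the device and the mid-network node rather than deferring everything to the server.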
Improved Coding for Image Feature Location Information
In mobile visual search applications, an image-based query is typically sent from a mobile client to the server. Because of the bit-rate limitations, the query should be as small as possible. When performing image-based retrieval with local features, there are two types of information: the descriptors of the image features and the locations of the image features within the image. Location information can be used to check geometric consistency of the set of features and thus improve the retrieval performance. To compress the location information, location histogram coding is an effective solution. We present a location histogram coder that reduces the bitrate by 2.8× when compared to a fixed-rate scheme and 12.5× when compared to a floating point representation of the locations. A drawback is the large context table, which can be difficult to store in the coder and requires large training data. We propose a new sum-based context for coding the location histogram map. We show that it can reduce the context by up to 200× while performing as well as or better than previously proposed location histogram coders.
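The abstract does not give the coder's details, so the following is only a sketch of the general idea under stated assumptions: feature locations are quantized into square blocks to form a binary occupancy map plus per-block counts, and a "sum-based context" is read here as the number of occupied, already-visited neighbor cells in raster-scan order. The block size, the neighborhood radius, and the function names `histogram_map` and `sum_context` are all hypothetical.

```python
import math


def histogram_map(points, width, height, block=4):
    """Quantize (x, y) feature locations into block x block pixel cells.

    Returns (occupied, counts): a binary map of non-empty cells and the
    number of features falling in each cell.
    """
    cols = math.ceil(width / block)
    rows = math.ceil(height / block)
    counts = [[0] * cols for _ in range(rows)]
    for x, y in points:
        c = min(int(x // block), cols - 1)
        r = min(int(y // block), rows - 1)
        counts[r][c] += 1
    occupied = [[1 if n else 0 for n in row] for row in counts]
    return occupied, counts


def sum_context(occupied, r, c, radius=1):
    """Assumed context: count of occupied cells among causal neighbors,
    i.e. cells in the rows above and to the left in the current row."""
    cols = len(occupied[0])
    total = 0
    for rr in range(max(0, r - radius), r + 1):
        for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
            if rr == r and cc >= c:
                break  # cells at or after the current one are not yet coded
            total += occupied[rr][cc]
    return total


if __name__ == "__main__":
    pts = [(1, 1), (2, 3), (9, 9), (5, 1)]  # hypothetical feature locations
    occupied, counts = histogram_map(pts, width=12, height=12, block=4)
    print(occupied)
    print(sum_context(occupied, 1, 1))
```

With a radius of 1 there are four causal neighbors, so a context keyed on the exact neighbor pattern has 2^4 = 16 states while a sum has only 5 (the values 0 through 4). That gap widens quickly as the neighborhood grows, which is the kind of context-table reduction a sum-based context is meant to provide.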

