Results 11 - 20 of 31
Hashing for Similarity Search: A Survey, 2014
Cited by 2 (2 self)
Similarity search (nearest neighbor search) is the problem of finding, in a large database, the data items whose distances to a query item are the smallest. Various methods have been developed to address this problem, and recently much effort has been devoted to approximate search. In this paper, we present a survey of one of the main solutions, hashing, which has been widely studied since the pioneering work on locality sensitive hashing. We divide the hashing algorithms into two main categories: locality sensitive hashing, which designs hash functions without exploring the data distribution, and learning to hash, which learns hash functions according to the data distribution. We review them from various aspects, including hash function design, distance measure, and search scheme in the hash coding space.
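To make the first category concrete, the following is a minimal sketch of one well-known data-independent scheme, random hyperplane hashing for angular similarity. It is an illustration under stated assumptions, not code from the survey: the helper names (`make_hyperplanes`, `hash_code`, `hamming_distance`), the 16-bit code length, and the toy vectors are all hypothetical choices.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw one random Gaussian hyperplane normal per hash bit (data-independent)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_code(vector, hyperplanes):
    """Pack the sign of each projection into an integer binary code."""
    code = 0
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vector))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming_distance(a, b):
    """Count the bits in which two codes differ."""
    return bin(a ^ b).count("1")


# Hypothetical toy data: a query, a rescaled copy, and its negation.
planes = make_hyperplanes(num_bits=16, dim=3, seed=42)
query = [1.0, 0.5, -0.2]
same_direction = [2.0, 1.0, -0.4]
opposite = [-1.0, -0.5, 0.2]

code_q = hash_code(query, planes)
# Positive scaling keeps the sign of every projection, so the codes match.
print(hamming_distance(code_q, hash_code(same_direction, planes)))
# Negation flips the sign of every projection, so all 16 bits differ.
print(hamming_distance(code_q, hash_code(opposite, planes)))
```

The hyperplanes are drawn without looking at any data, which is what distinguishes this family from learning to hash, where the projections would instead be fitted to the data distribution.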
Smart hashing update for fast response - In IJCAI, 2013
Cited by 2 (0 self)
Recent years have witnessed the growing popularity of hash function learning for large-scale data search. Although most existing hashing-based methods have achieved promising performance, they are regarded as passive hashing and assume that the labelled pairs are provided in advance. In this paper, we consider updating a hashing model upon gradually increased labelled data in a fast response to users, called smart hashing update (SHU). In order to get a fast response to users, SHU aims to select a small set of hash functions to re-learn and only updates the corresponding hash bits of all data points. More specifically, we put forward two selection methods for performing efficient and effective update. In order to reduce the response time for acquiring a stable hashing code, we also propose an accelerated method to further reduce interactions between users and the computer. We evaluate our proposals on two benchmark data sets. Our experimental results show it is not necessary to update all hash bits in order to adapt the model to new input data, and our model obtains better or similar performance without sacrificing much accuracy against the batch mode update.
Large-Scale Video Hashing via Structure Learning
Cited by 2 (0 self)
Recently, learning based hashing methods have become popular for indexing large-scale media data. Hashing methods map high-dimensional features to compact binary codes that are efficient to match and robust in preserving original similarity. However, most of the existing hashing methods treat videos as a simple aggregation of independent frames and index each video through combining the indexes of frames. The structure information of videos, e.g., discriminative local visual commonality and temporal consistency, is often neglected in the design of hash functions. In this paper, we propose a supervised method that explores the structure learning techniques to design efficient hash functions. The proposed video hashing method formulates a minimization problem over a structure-regularized empirical loss. In particular, the structure regularization exploits the common local visual patterns occurring in video frames that are associated with the same semantic class, and simultaneously preserves the temporal consistency over successive frames from the same video. We show that the minimization objective can be efficiently solved by an Accelerated Proximal Gradient (APG) method. Extensive experiments on two large video benchmark datasets (up to around 150K video clips with over 12 million frames) show that the proposed method significantly outperforms the state-of-the-art hashing methods.
Non-transitive Hashing with Latent Similarity Components
Cited by 1 (0 self)
Approximating the semantic similarity between entities in the learned Hamming space is the key for supervised hashing techniques. The semantic similarities between entities are often non-transitive since they could share different latent similarity components. For example, in social networks, we connect with people for various reasons, such as sharing common interests, working in the same company, being alumni and so on. Obviously, these social connections are non-transitive if people are connected due to different reasons. However, existing supervised hashing methods treat the pairwise similarity relationships in a simple and unified way and project data into a single Hamming space, while neglecting that the non-transitive property cannot be adequately captured by a single Hamming space. In this pa-
What is the most efficient way to select nearest neighbor candidates for fast approximate nearest neighbor search - In Proc. 14th International Conference on Computer Vision, 2013
Cited by 1 (1 self)
Approximate nearest neighbor search (ANNS) is a basic and important technique used in many tasks
FOREST HASHING: EXPEDITING LARGE SCALE IMAGE RETRIEVAL
Cited by 1 (1 self)
This paper introduces a hybrid method for searching large image datasets for approximate nearest neighbor items, specifically SIFT descriptors. The basic idea behind our method is to create a serial system that first partitions approximate nearest neighbors using multiple kd-trees before calling upon locally designed spectral hashing tables for retrieval. This combination gives us the local approximate nearest neighbor accuracy of kd-trees with the computational efficiency of hashing techniques. Experimental results show that our approach efficiently and accurately outperforms previous methods designed to achieve similar goals. Index Terms: image retrieval, kd-tree, spectral hashing, forest hashing.
VISUAL SEARCH
Document Retrieval in Big Data
Nearest Neighbor Search for similar document retrieval suffers from an efficiency problem when scaled to a large dataset. In this paper, we introduce an unsupervised approach based on Locality Sensitive Hashing to alleviate its search complexity problem. The advantage of our proposed approach is that it does not need to scan all the documents to retrieve the top-K Nearest Neighbors; instead, a number of hash table lookup operations are conducted to retrieve the top-K candidates. Experiments on two massive news and tweets datasets demonstrate that our approach is able to achieve over an order of magnitude speedup compared with the traditional Information Retrieval method and maintain reasonable precision.
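The lookup-instead-of-scan idea described above can be sketched roughly as follows. This is not the paper's implementation: it assumes documents are already represented as fixed-length term-weight vectors, uses random hyperplane signatures as the hash family, and the class name `LSHIndex`, its parameters, and the toy documents are hypothetical.

```python
import math
import random
from collections import defaultdict


def signature(vector, planes):
    """Bucket key: the sign pattern of the vector against each hyperplane."""
    return tuple(sum(p * v for p, v in zip(plane, vector)) >= 0 for plane in planes)


def cosine(a, b):
    """Cosine similarity, used only to rank the retrieved candidates."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class LSHIndex:
    """Several hash tables, each keyed by an independent random signature."""

    def __init__(self, dim, num_tables=4, bits_per_table=8, seed=0):
        rng = random.Random(seed)
        self.tables = []
        for _ in range(num_tables):
            planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                      for _ in range(bits_per_table)]
            self.tables.append((planes, defaultdict(list)))
        self.vectors = {}

    def add(self, doc_id, vector):
        self.vectors[doc_id] = vector
        for planes, buckets in self.tables:
            buckets[signature(vector, planes)].append(doc_id)

    def query(self, vector, k):
        # One bucket lookup per table; no scan over the full collection.
        candidates = set()
        for planes, buckets in self.tables:
            candidates.update(buckets.get(signature(vector, planes), []))
        ranked = sorted(candidates,
                        key=lambda d: cosine(vector, self.vectors[d]),
                        reverse=True)
        return ranked[:k]


# Hypothetical toy collection of three documents over a 3-term vocabulary.
index = LSHIndex(dim=3)
index.add("a", [1.0, 0.0, 0.0])
index.add("b", [0.0, 1.0, 0.0])
index.add("c", [0.0, 0.0, 1.0])
print(index.query([2.0, 0.0, 0.0], k=1))
```

Only the documents that share a bucket with the query in at least one table are ranked, so the cost of a query depends on bucket sizes rather than on the collection size. Using more tables raises the chance of finding true neighbors at the price of more candidates to rank.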