Results 1–10 of 16
Privacy-preserving Queries over Relational Databases
Cited by 13 (5 self)
We explore how Private Information Retrieval (PIR) can help users keep their sensitive information from being leaked in an SQL query. We show how to retrieve data from a relational database with PIR by hiding the sensitive constants contained in the predicates of a query. Experimental results and microbenchmarking tests show that our approach incurs reasonable storage overhead for the added privacy benefit and performs between 3 and 343 times faster than previous work.
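To illustrate the PIR primitive this abstract builds on, here is a minimal two-server XOR-based PIR sketch in Python. This is the classic information-theoretic construction, not the paper's scheme; all names (`db`, `xor_pir_query`, `server_answer`) are invented for the example.

```python
import secrets

def xor_pir_query(n, index):
    """Two-server XOR PIR: each server sees a random-looking index subset,
    so neither server alone learns which record the client wants."""
    q1 = {i for i in range(n) if secrets.randbits(1)}
    q2 = q1 ^ {index}  # symmetric difference: the two queries differ only at `index`
    return q1, q2

def server_answer(db, query):
    """Each server XORs together the records named in its query."""
    ans = 0
    for i in query:
        ans ^= db[i]
    return ans

db = [5, 9, 17, 33]
q1, q2 = xor_pir_query(len(db), 2)
# XORing both answers cancels every record except db[2]
assert server_answer(db, q1) ^ server_answer(db, q2) == db[2]
```

Hiding a constant in a query predicate reduces, in this setting, to retrieving the matching record by index without revealing which index was asked for.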
Theory and Practise of Monotone Minimal Perfect Hashing
Cited by 13 (7 self)
Minimal perfect hash functions have been shown to be useful for compressing data in several data management tasks. In particular, order-preserving minimal perfect hash functions [12] have been used to retrieve the position of a key in a given list of keys; however, the ability to preserve any given order leads to an unavoidable Ω(n log n) lower bound on the number of bits required to store the function. Recently, it was observed [1] that very frequently the keys to be hashed are sorted in their intrinsic (i.e., lexicographical) order. This is typically the case for dictionaries of search engines, lists of URLs of web graphs, etc. We refer to this restricted version of the problem as monotone minimal perfect hashing. We analyse experimentally the data structures proposed in [1], and along the way we propose some new methods that, albeit asymptotically equivalent or worse, perform very well in practise and provide a balance between access speed, ease of construction, and space usage.
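The interface of a monotone minimal perfect hash function can be sketched with a toy rank-based stand-in. The real structures in [1] use far less space than storing the full key list; this sketch only shows the contract (sorted keys map to their ranks), and all names are illustrative.

```python
from bisect import bisect_left

def monotone_mph(sorted_keys):
    """Toy monotone 'minimal perfect hash': maps each key to its rank.

    A real monotone MPHF stores o(key list) bits; here we keep the whole
    sorted list and binary-search it, which only illustrates the interface.
    """
    def h(key):
        i = bisect_left(sorted_keys, key)
        if i == len(sorted_keys) or sorted_keys[i] != key:
            raise KeyError(key)
        return i
    return h

keys = ["apple", "banana", "cherry", "date"]  # already in lexicographical order
h = monotone_mph(keys)
# monotone: hash values follow the keys' intrinsic order, with no collisions
assert [h(k) for k in keys] == [0, 1, 2, 3]
```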
Model checking via delayed duplicate detection on the GPU
, 2008
Cited by 9 (5 self)
In this paper we improve large-scale disk-based model checking by shifting complex numerical operations to the graphics card, exploiting the fact that during the last decade graphics processing units (GPUs) have become very powerful. For disk-based graph search, the delayed elimination of duplicates is the performance bottleneck, as it amounts to sorting large state-vector sets. We perform parallel processing on the GPU to improve the sorting speed significantly. Since existing GPU sorting solutions like Bitonic Sort and Quicksort do not achieve any speedup on state vectors, we propose a refined GPU-based Bucket Sort algorithm. Alternatively, we study sorting a compressed state vector and obtain speedups for delayed duplicate detection of more than one order of magnitude with a single GPU located on an ordinary graphics card.
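The core idea, delayed duplicate detection by sorting, can be sketched on the CPU. Lists stand in for disk-resident sorted runs here; the paper's contribution is performing the expensive sort step on the GPU.

```python
def delayed_duplicate_detection(frontier, visited_sorted):
    """Remove duplicates from a search frontier in one batch.

    Sort the frontier once, then a single merge scan against the
    (already sorted) visited run filters out every known state, which
    is exactly the access pattern disk-based search needs.
    """
    frontier_sorted = sorted(set(frontier))  # dedupe within the frontier itself
    new_states, i = [], 0
    for s in frontier_sorted:
        while i < len(visited_sorted) and visited_sorted[i] < s:
            i += 1  # advance the merge scan over the visited run
        if i == len(visited_sorted) or visited_sorted[i] != s:
            new_states.append(s)  # genuinely new state
    return new_states

# states 2 was visited before; 3 appears twice in the frontier
assert delayed_duplicate_detection([3, 1, 3, 2], [2]) == [1, 3]
```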
Parallel State Space Search on the GPU
, 2009
Cited by 4 (1 self)
This paper exploits the parallel computing power of the graphics card for the enhanced enumeration of state spaces. We illustrate that modern graphics processing units (GPUs) have the potential to speed up state space search significantly. For a bit-vector representation of the search frontier, GPU algorithms with one and two bits per state are presented. For enhanced compression, efficient perfect hash functions and their inverses are studied. We establish maximal speedups of up to a factor of 30 and more with respect to single-core computation.
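A one-bit-per-state visited set addressed by a perfect hash can be sketched as follows. As a simplification of the paper's setting, the identity function over state indices 0..n−1 plays the role of the perfect hash; the class name is illustrative.

```python
class BitStateSet:
    """Visited set using one bit per state in a packed bit vector.

    The state index doubles as the perfect hash value here; the paper
    additionally studies inverse perfect hashes to recover states from bits.
    """
    def __init__(self, n_states):
        self.bits = bytearray((n_states + 7) // 8)  # ceil(n/8) bytes

    def add(self, state):
        self.bits[state >> 3] |= 1 << (state & 7)

    def __contains__(self, state):
        return bool(self.bits[state >> 3] & (1 << (state & 7)))

s = BitStateSet(1000)   # 1000 states in only 125 bytes
s.add(42)
assert 42 in s and 43 not in s
```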
Semi-external LTL model checking
In Computer-Aided Verification (CAV)
, 2008
Cited by 4 (3 self)
In this paper we establish c-bit semi-external graph algorithms, i.e., algorithms which need only a constant number c of bits per vertex in internal memory. In this setting, we obtain new trade-offs between time and space for I/O-efficient LTL model checking. First, we design a c-bit semi-external algorithm for depth-first search. To achieve a low internal memory consumption, we construct a RAM-efficient perfect hash function from the vertex set stored on disk. We give a similar algorithm for double depth-first search, which checks for the presence of accepting cycles and thus solves the LTL model checking problem. The I/O complexity of the search itself is proportional to the time for scanning the search space. For on-the-fly model checking we apply the iterative-deepening strategy known from bounded model checking.
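The c-bit idea, keeping only a constant number of bits per vertex in RAM during depth-first search, can be sketched like this. A dictionary stands in for the disk-resident adjacency lists, and the hypothetical `neighbors` callback stands in for the hash-mediated disk access; this is a sketch of the memory layout, not the paper's algorithm.

```python
def semi_external_dfs(n, neighbors, start):
    """Iterative DFS whose only per-vertex RAM state is one bit (c = 1).

    `neighbors(v)` models reading vertex v's adjacency list from disk;
    everything else in RAM is the packed visited bit vector and the stack.
    """
    visited = bytearray((n + 7) // 8)  # 1 bit per vertex in internal memory

    def seen(v):
        return visited[v >> 3] & (1 << (v & 7))

    def mark(v):
        visited[v >> 3] |= 1 << (v & 7)

    order, stack = [], [start]
    while stack:
        v = stack.pop()
        if seen(v):
            continue
        mark(v)
        order.append(v)
        stack.extend(reversed(neighbors(v)))  # reversed: visit in listed order
    return order

adj = {0: [1, 2], 1: [2], 2: []}  # toy 'on-disk' graph
assert semi_external_dfs(3, lambda v: adj[v], 0) == [0, 1, 2]
```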
Practical Batch-Updatable External Hashing with Sorting
Cited by 2 (1 self)
This paper presents a practical external hashing scheme that supports fast lookup (7 microseconds) for large datasets (millions to billions of items) with a small memory footprint (2.5 bits/item) and fast index construction (151 K items/s for 1 KiB key-value pairs). Our scheme combines three key techniques: (1) a new index data structure (Entropy-Coded Tries); (2) the use of sorting as the main data manipulation method; and (3) support for incremental index construction for dynamic datasets. We evaluate our scheme by building an external dictionary on flash-based drives and demonstrate our scheme's high performance, compactness, and practicality.
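Sorting as the main data manipulation method can be illustrated with a toy batch update that merges a sorted batch of new pairs into an existing sorted index run. Lists stand in for disk files, and this is not the paper's Entropy-Coded Trie structure, only the sort-and-merge update pattern.

```python
def batch_update(index_run, batch):
    """Apply a batch of inserts/overwrites to a sorted run in one pass.

    Sort the batch, then a single merge scan produces the new sorted run;
    on ties the batch wins, which implements overwrite semantics.
    """
    batch_run = sorted(batch.items())
    out, i, j = [], 0, 0
    while i < len(index_run) and j < len(batch_run):
        if index_run[i][0] < batch_run[j][0]:
            out.append(index_run[i]); i += 1
        elif index_run[i][0] > batch_run[j][0]:
            out.append(batch_run[j]); j += 1
        else:  # same key: the newer batch entry overrides the old one
            out.append(batch_run[j]); i += 1; j += 1
    out.extend(index_run[i:])
    out.extend(batch_run[j:])
    return out

old = [("a", 1), ("c", 3)]
assert batch_update(old, {"b": 2, "c": 30}) == [("a", 1), ("b", 2), ("c", 30)]
```

Because both inputs and the output are sorted runs, each update is sequential I/O, which is what makes the approach friendly to external storage.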
Practical perfect hashing in nearly optimal space
 Information Systems
Cited by 2 (1 self)
A hash function is a mapping from a key universe U to a range of integers, i.e., h: U → {0, 1, ..., m−1}, where m is the range's size. A perfect hash function for some set S ⊆ U is a hash function that is one-to-one on S, where m ≥ |S|. A minimal perfect hash function for some set S ⊆ U is a perfect hash function with a range of minimum size, i.e., m = |S|. This paper presents a construction for (minimal) perfect hash functions that combines theoretical analysis, practical performance, expected linear construction time, and nearly optimal space consumption for the data structure. For n keys and m = n, the space consumption ranges from 2.62n to 3.3n bits, and for m = 1.23n it ranges from 1.95n to 2.7n bits. This is within a small constant factor of the theoretical lower bounds of 1.44n bits for m = n and 0.89n bits for m = 1.23n. We combine several theoretical results into a practical solution that has turned perfect hashing into a very compact data structure for solving the membership problem when the key set S is static and known in advance. By taking into account the memory hierarchy, we can construct (minimal) perfect hash functions for over a billion keys in 46 minutes using a commodity PC. An open-source implementation of the algorithms is available.
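The definitions above can be checked mechanically. A small sketch, where the hash function `h` is hand-built for this particular S purely for illustration:

```python
def is_perfect(h, S, m):
    """h is perfect on S if it maps S one-to-one into {0, ..., m-1}."""
    values = [h(x) for x in S]
    return all(0 <= v < m for v in values) and len(set(values)) == len(S)

def is_minimal_perfect(h, S):
    """Minimal: perfect with range of minimum size, m = |S|."""
    return is_perfect(h, S, len(S))

S = {10, 25, 42}
h = {10: 0, 25: 2, 42: 1}.get  # a hand-built MPHF for exactly this S
assert is_minimal_perfect(h, S)
```

The point of the paper's construction is producing such an `h` automatically, for billions of keys, in only a few bits per key.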
Document Vector Representations for Feature Extraction in Multi-Stage Document Ranking
, 2012
Cited by 2 (2 self)
We consider a multi-stage retrieval architecture consisting of a fast, "cheap" candidate generation stage, a feature extraction stage, and a more "expensive" reranking stage using machine-learned models. In this context, feature extraction can be accomplished using a document vector index, a mapping from document ids to document representations. We consider alternative organizations of such a data structure for efficient feature extraction: design choices include how document terms are organized, how complex term proximity features are computed, and how these structures are compressed. In particular, we propose a novel document-adaptive hashing scheme for compactly encoding term ids. The impact of alternative designs on both feature extraction speed and memory footprint is experimentally evaluated. Overall, results show that our architecture is comparable in speed to using a traditional positional inverted index but requires less memory overall, and offers additional advantages in terms of flexibility.
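In its simplest, uncompressed organization, a document vector index might look like the sketch below: a map from document ids to term-id sequences. This is illustrative only and does not reflect the paper's compressed layouts or its document-adaptive term-id hashing.

```python
def build_doc_vectors(docs):
    """Build a toy document vector index: doc id -> list of term ids.

    Term ids are assigned in first-seen order over the whole collection;
    positions are implicit in the list order, enabling proximity features.
    """
    vocab = {}   # term -> global term id
    index = {}   # doc id -> term-id sequence (the document vector)
    for doc_id, text in docs.items():
        index[doc_id] = [vocab.setdefault(t, len(vocab)) for t in text.split()]
    return vocab, index

vocab, index = build_doc_vectors({1: "a b a", 2: "b c"})
assert index[1] == [0, 1, 0] and index[2] == [1, 2]
```

Feature extraction for a candidate document then becomes a single lookup in `index`, rather than a traversal of posting lists.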
Near-Optimal Space Perfect Hashing Algorithms
Cited by 2 (0 self)
A perfect hash function (PHF) is an injective function that maps keys from a set S to unique values. Since no collisions occur, each key can be retrieved from a hash table with a single probe. A minimal perfect hash function (MPHF) is a PHF with the smallest possible range, that is, the hash table size is exactly the number of keys in S. Unlike other hashing schemes, MPHFs completely avoid the problems of wasted space and of wasted time spent dealing with collisions. The study of perfect hash functions started in the early 80s, when it was proved that the information-theoretic lower bound to describe a minimal perfect hash function is approximately 1.44 bits per key. Although the proof indicates that it would be possible to build an algorithm capable of generating optimal functions, no one was able to obtain a practical algorithm that could be used in real applications. Thus, there was a gap between theory and practice. The main result of the thesis fills this gap, lowering the space complexity to represent MPHFs that are useful in practice from O(n log n) to O(n) bits. This allows the use of perfect hashing in applications for which it was previously not considered a good option. This explicit construction of PHFs is something that the data structures and algorithms community has been looking for since the 1980s.
Link Discovery with Guaranteed Reduction Ratio in Affine Spaces with Minkowski Measures
 ISWC 2012, PART I. LNCS
, 2012
Cited by 1 (0 self)
Time-efficient algorithms are essential to address the complex linking tasks that arise when trying to discover links on the Web of Data. Although several lossless approaches have been developed for this exact purpose, they do not offer theoretical guarantees with respect to their performance. In this paper, we address this drawback by presenting the first Link Discovery approach with theoretical quality guarantees. In particular, we prove that given an achievable reduction ratio r, our Link Discovery approach HR³ can achieve a reduction ratio r′ ≤ r in a metric space where distances are measured by means of a Minkowski metric of any order p ≥ 2. We compare HR³ and the HYPPO algorithm implemented in LIMES 0.5 with respect to the number of comparisons they carry out. In addition, we compare our approach with the algorithms implemented in the state-of-the-art frameworks LIMES 0.5 and SILK 2.5 with respect to runtime. We show that HR³ outperforms these previous approaches with respect to runtime in each of our four experimental setups.
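For reference, the Minkowski metric of order p that the guarantee is stated over is straightforward to compute; a minimal sketch (function name and sample points are illustrative):

```python
def minkowski(x, y, p):
    """Minkowski distance of order p between two points:
    (sum_i |x_i - y_i|^p)^(1/p). p = 2 is the Euclidean metric."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

# the classic 3-4-5 triangle under the Euclidean (p = 2) metric
assert minkowski((0, 0), (3, 4), 2) == 5.0
```

Larger p weights the largest coordinate difference more heavily, which is what lets HR³ bound the comparisons needed within a distance threshold for any p ≥ 2.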