Results 1  10
of
49
Order Preserving Encryption for Numeric Data
, 2004
"... Encryption is a well established technology for protecting sensitive data. However, once encrypted, data can no longer be easily queried aside from exact matches. We present an orderpreserving encryption scheme for numeric data that allows any comparison operation to be directly applied on encrypte ..."
Abstract

Cited by 197 (2 self)
 Add to MetaCart
(Show Context)
Encryption is a well established technology for protecting sensitive data. However, once encrypted, data can no longer be easily queried aside from exact matches. We present an orderpreserving encryption scheme for numeric data that allows any comparison operation to be directly applied on encrypted data. Query results produced are sound (no false hits) and complete (no false drops). Our scheme handles updates gracefully and new values can be added without requiring changes in the encryption of other values. It allows standard database indexes to be built over encrypted tables and can easily be integrated with existing database systems. The proposed scheme has been designed to be deployed in application environments in which the intruder can get access to the encrypted database, but does not have prior domain information such as the distribution of values and cannot encrypt or decrypt arbitrary values of his choice. The encryption is robust against estimation of the true value in such environments.
Annotation of Heterogeneous Multimedia Content Using Automatic Speech Recognition
 Proceedings SAMT 2007 2007 vol. 4816, Lecture Notes in Computer Science,7890
"... Abstract. This paper reports on the setup and evaluation of robust speech recognition system parts, geared towards transcript generation for heterogeneous, reallife media collections. The system is deployed for generating speech transcripts for the NIST/TRECVID2007 test collection, part of a Dutc ..."
Abstract

Cited by 32 (6 self)
 Add to MetaCart
(Show Context)
Abstract. This paper reports on the setup and evaluation of robust speech recognition system parts, geared towards transcript generation for heterogeneous, reallife media collections. The system is deployed for generating speech transcripts for the NIST/TRECVID2007 test collection, part of a Dutch reallife archive of newsrelated genres. Performance figures for this type of content are compared to figures for broadcast news test data. 1
Privacypreserving Queries over Relational Databases
"... Abstract—We explore how Private Information Retrieval (PIR) can help users keep their sensitive information from being leaked in an SQL query. We show how to retrieve data from a relational database with PIR by hiding sensitive constants contained in the predicates of a query. Experimental results a ..."
Abstract

Cited by 19 (6 self)
 Add to MetaCart
(Show Context)
Abstract—We explore how Private Information Retrieval (PIR) can help users keep their sensitive information from being leaked in an SQL query. We show how to retrieve data from a relational database with PIR by hiding sensitive constants contained in the predicates of a query. Experimental results and microbenchmarking tests show our approach incurs reasonable storage overhead for the added privacy benefit and performs between 3 and 343 times faster than previous work. I.
External perfect hashing for very large key sets
, 2008
"... A perfect hash function (PHF) h: S → [0, m − 1] for a key set S ⊆ U of size n, where m ≥ n and U is a key universe, is an injective function that maps the keys of S to unique values. A minimal perfect hash function (MPHF) is a PHF with m = n, the smallest possible range. Minimal perfect hash functio ..."
Abstract

Cited by 18 (4 self)
 Add to MetaCart
A perfect hash function (PHF) h: S → [0, m − 1] for a key set S ⊆ U of size n, where m ≥ n and U is a key universe, is an injective function that maps the keys of S to unique values. A minimal perfect hash function (MPHF) is a PHF with m = n, the smallest possible range. Minimal perfect hash functions are widely used for memory efficient storage and fast retrieval of items from static sets. In this paper we present a distributed and parallel version of a simple, highly scalable and nearspace optimal perfect hashing algorithm for very large key sets, recently presented in [4]. The sequential implementation of the algorithm constructs a MPHF for a set of 1.024 billion URLs of average length 64 bytes collected from the Web in approximately 50 minutes using a commodity PC. The parallel implementation proposed here presents the following performance using 14 commodity PCs: (i) it constructs a MPHF for the same set of 1.024 billion URLs in approximately 4 minutes; (ii) it constructs a MPHF for a set of 14.336 billion 16byte random integers in approximately 50 minutes with a performance degradation of 20%; (iii) one version of the parallel algorithm distributes the description of the MPHF among the participating machines and its evaluation is done in a distributed way, faster than the centralized function.
Perfect hashing for strings: Formalization and Algorithms
 IN PROC 7TH CPM
, 1996
"... Numbers and strings are two objects manipulated by most programs. Hashing has been wellstudied for numbers and it has been effective in practice. In contrast, basic hashing issues for strings remain largely unexplored. In this paper, we identify and formulate the core hashing problem for strings th ..."
Abstract

Cited by 16 (2 self)
 Add to MetaCart
(Show Context)
Numbers and strings are two objects manipulated by most programs. Hashing has been wellstudied for numbers and it has been effective in practice. In contrast, basic hashing issues for strings remain largely unexplored. In this paper, we identify and formulate the core hashing problem for strings that we call substring hashing. Our main technical results are highly efficient sequential/parallel (CRCW PRAM) Las Vegas type algorithms that determine a perfect hash function for substring hashing. For example, given a binary string of length n, one of our algorithms finds a perfect hash function in O(log n) time, O(n) work, and O(n) space; the hash value for any substring can then be computed in O(log log n) time using a single processor. Our approach relies on a novel use of the suffix tree of a string. In implementing our approach, we design optimal parallel algorithms for the problem of determining weighted ancestors on a edgeweighted tree that may be of independent interest.
Code generation for Mercury
 In Proceedings of the Twelfth International Conference on Logic Programming
, 1995
"... Mercury is a new purely declarative logic programming language that requires programmers to write declarations for every predicate in the program. Although the main motivation for this requirement is that it allows the compiler to catch most programmer errors, it also allows the Mercury code generat ..."
Abstract

Cited by 14 (5 self)
 Add to MetaCart
Mercury is a new purely declarative logic programming language that requires programmers to write declarations for every predicate in the program. Although the main motivation for this requirement is that it allows the compiler to catch most programmer errors, it also allows the Mercury code generator to rely on the presence of type, mode and determinism information about every predicate in the program. The code generator exploits a new execution algorithm based on the availability of this information as well as some novel techniques (lazy code generation, followcode migration, the use of a compiletime failure continuation stack) to produce very high quality code. This code is in C and is therefore quite portable. Benchmarks show that the Mercury implementation produces much faster code than wamcc, Quintus Prolog, SICStus Prolog and Aquarius Prolog.
Bloomier filters: A second look
 Algorithms  ESA 2008, 16th Annual European Symposium
"... Abstract. A Bloom filter is a space efficient structure for storing static sets, where the space efficiency is gained at the expense of a small probability of falsepositives. A Bloomier filter generalizes a Bloom filter to compactly store a function with a static support. In this article we give a ..."
Abstract

Cited by 12 (0 self)
 Add to MetaCart
Abstract. A Bloom filter is a space efficient structure for storing static sets, where the space efficiency is gained at the expense of a small probability of falsepositives. A Bloomier filter generalizes a Bloom filter to compactly store a function with a static support. In this article we give a simple construction of a Bloomier filter. The construction is linear in space and requires constant time to evaluate. The creation of our Bloomier filter takes linear time which is faster than the existing construction. We show how one can improve the space utilization further at the cost of increasing the time for creating the data structure. 1
A practical minimal perfect hashing method
 In Proc. of the 4th International Workshop on Efficient and Experimental Algorithms (WEA’05
, 2005
"... Abstract. We propose a novel algorithm based on random graphs to construct minimal perfect hash functions h. For a set of n keys, our algorithm outputs h in expected time O(n). The evaluation of h(x) requires two memory accesses for any key x and the description of h takes up 1.15n words. This impro ..."
Abstract

Cited by 12 (6 self)
 Add to MetaCart
(Show Context)
Abstract. We propose a novel algorithm based on random graphs to construct minimal perfect hash functions h. For a set of n keys, our algorithm outputs h in expected time O(n). The evaluation of h(x) requires two memory accesses for any key x and the description of h takes up 1.15n words. This improves the space requirement to 55 % of a previous minimal perfect hashing scheme due to Czech, Havas and Majewski. A simple heuristic further reduces the space requirement to 0.93n words, at the expense of a slightly worse constant in the time complexity. Large scale experimental results are presented. 1
Partial Automation of an Integrated Reverse Engineering Environment of Binary Code
 In Third Working Conference on Reverse Engineering
, 1996
"... The constant development of newer and faster machines requires software to be made available on those new machines at a rate faster than what it takes to develop the software. The use of binary translation techniques to migrate software from one machine to another is effectiveit makes software av ..."
Abstract

Cited by 10 (0 self)
 Add to MetaCart
(Show Context)
The constant development of newer and faster machines requires software to be made available on those new machines at a rate faster than what it takes to develop the software. The use of binary translation techniques to migrate software from one machine to another is effectiveit makes software available in little time without incurring reprogramming costs. However, the development of such a tool is in itself an issue, as with each new architecture, a new tool needs to be written. We present a partially automated integrated environment for the reverse engineering of binary or executable code. This environment is suitable for the development of disassemblers, binary translators and decompilers. 1. Introduction Reverse engineering of software systems has been defined as the analysis of a subject system to identify the system 's components and their interrelationships, and to create a representation of the system in another form or at a higher level of abstraction [5]. The aim of rever...
Hash and displace: Efficient evaluation of minimal perfect hash functions
 In Workshop on Algorithms and Data Structures
, 1999
"... A new way of constructing (minimal) perfect hash functions is described. The technique considerably reduces the overhead associated with resolving buckets in twolevel hashing schemes. Evaluating a hash function requires just one multiplication and a few additions apart from primitive bit operations ..."
Abstract

Cited by 10 (2 self)
 Add to MetaCart
(Show Context)
A new way of constructing (minimal) perfect hash functions is described. The technique considerably reduces the overhead associated with resolving buckets in twolevel hashing schemes. Evaluating a hash function requires just one multiplication and a few additions apart from primitive bit operations. The number of accesses to memory is two, one of which is to a fixed location. This improves the probe performance of previous minimal perfect hashing schemes, and is shown to be optimal. The hash function description (“program”) for a set of size n occupies O(n) words, and can be constructed in expected O(n) time. 1