Signatures for Library Functions in Executable Files
, 1993
Cited by 1 (0 self)
A method for efficiently generating signatures for detecting library functions in executable files is described. The signatures are used to automatically detect such functions in dcc, the reverse compiler at the Queensland University of Technology. Difficulties arise from the variability of the signatures, the multiplicity of library code vendors, and of memory models, and indistinguishable functions. An efficient hashing technique involving perfect optimal hashing functions is used. Performance is good - the signature files are created in a few seconds, and the name of a library function can be found in about the time of two standard hashes. One signature file is required for each vendor, version, and memory model combination, and they are generated from the appropriate library file (e.g. slibce.lib). Some issues are yet to be addressed, such as variation due to floating point math options (e.g. emulator, fast alternate, or coprocessor calls). 1 Application Signatures are required w...
Trie Methods for Structured Data on Secondary Storage
, 2000
Cited by 1 (0 self)
We apply the trie structures to indexing, storing and querying structured data on secondary storage. We are interested in the storage compactness, the I/O efficiency, the order-preserving properties, the general orthogonal range queries and the exact match queries for very large files and databases. We also apply the trie structures to relational joins (set operations). We compare trie structures to various data structures on secondary storage: multipaging and grid files in the direct access method category, R-trees/R*-trees and X-trees in the logarithmic access cost category, as well as some representative join algorithms for performing join operations. Our results show that range queries by trie method are superior to these competitors in search cost when queries return more than a few records and are competitive to direct access methods for exact match queries. Furthermore, as the trie structure compresses data, it is the winner in terms of storage compared to all other methods mentioned above. We also present a new tidy function for order-preserving key-to-address transformation. Our tidy function is easy to construct and cheaper in access time and storage cost compared to its closest competitor.
Finding Succinct Ordered Minimal Perfect Hash Functions
, 1994
Cited by 1 (0 self)
An ordered minimal perfect hash table is one in which no collisions occur among a predefined set of keys, no space is unused and the data are placed in the table in order. A new method for creating ordered minimal perfect hash functions is presented. It creates hash functions with representation space requirements closer to the theoretical lower bound than previous methods. The method presented requires approximately 17% less space to represent generated hash functions and is easy to implement. However, a high time complexity makes it practical for small sets only (size < 1000).
Keywords: Data Structures, Hashing, Perfect Hashing
1 Introduction
A hash table is a data structure in which a number of keyed items are stored. To access an item with a given key, a hash function is used. The hash function maps from the set of keys to the set of locations of the table. If more than one key maps to a given location, a collision occurs, and some collision resolution policy must be followed. O...
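To make the "ordered" property in the abstract above concrete, here is a minimal sketch in Python. It is not the method of the listed paper: it swaps in a well-known graph-based construction (in the style of Czech, Havas and Majewski), and the names `vertex`, `build_ordered_mphf` and `lookup` are hypothetical. Each key becomes an edge between two hashed vertices. If the resulting graph is acyclic, vertex labels `g` can be assigned so that the two labels of key number i sum to i modulo n.

```python
import hashlib

def vertex(key: str, seed: int, which: int, m: int) -> int:
    """Map `key` to one of m graph vertices; `which` selects hash 0 or 1."""
    salt = seed.to_bytes(8, "little") + which.to_bytes(8, "little")
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little") % m

def build_ordered_mphf(keys, c=3, max_seed=10_000):
    """Return (seed, g, m) such that lookup(keys[i], ...) == i for every i.

    Assumes `keys` is a non-empty list of distinct strings.
    """
    n, m = len(keys), c * len(keys)
    for seed in range(max_seed):
        adj = [[] for _ in range(m)]
        has_loop = False
        for i, k in enumerate(keys):
            u, v = vertex(k, seed, 0, m), vertex(k, seed, 1, m)
            if u == v:                      # self-loop: retry with a new seed
                has_loop = True
                break
            adj[u].append((v, i))
            adj[v].append((u, i))
        if has_loop:
            continue
        g = [None] * m
        acyclic = True
        for root in range(m):
            if g[root] is not None:
                continue
            g[root] = 0
            stack = [(root, -1)]            # (vertex, edge used to reach it)
            while stack and acyclic:
                u, via = stack.pop()
                for v, i in adj[u]:
                    if i == via:            # do not walk back along the same edge
                        continue
                    if g[v] is not None:    # reached a labelled vertex: cycle
                        acyclic = False
                        break
                    g[v] = (i - g[u]) % n   # forces g[u] + g[v] == i (mod n)
                    stack.append((v, i))
            if not acyclic:
                break
        if acyclic:
            return seed, g, m
    raise ValueError("no acyclic graph found; try a larger c or max_seed")

def lookup(key: str, seed: int, g, m: int, n: int) -> int:
    """Evaluate the ordered minimal perfect hash function for `key`."""
    return (g[vertex(key, seed, 0, m)] + g[vertex(key, seed, 1, m)]) % n

keys = sorted(["delta", "alpha", "echo", "charlie", "bravo"])
seed, g, m = build_ordered_mphf(keys)
positions = [lookup(k, seed, g, m, len(keys)) for k in keys]
```

For keys in the build set, `positions` lists the table slots in the same order as the keys, so no collisions occur, no slot is unused, and order is preserved. The sketch stores one label per vertex and makes no attempt at the succinct representation the abstract is concerned with.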
Indexing Internal Memory with Minimal Perfect Hash Functions
Cited by 1 (1 self)
Abstract. A perfect hash function (PHF) is an injective function that maps keys from a set S to unique values, which are in turn used to index a hash table. Since no collisions occur, each key can be retrieved from the table with a single probe. A minimal perfect hash function (MPHF) is a PHF with the smallest possible range, that is, the hash table size is exactly the number of keys in S. MPHFs are widely used for memory-efficient storage and fast retrieval of items from static sets. Unlike other hashing schemes, MPHFs completely avoid the problem of wasted space and wasted time to deal with collisions. In the past, the amount of space to store an MPHF description was O(log n) bits per key and therefore similar to the space overhead of other hashing schemes. Recent results on MPHFs by [Botelho et al. 2007] changed this scenario: in their work the space overhead of an MPHF is approximately 2.6 bits per key. The objective of this paper is to show that MPHFs are a good option to index internal memory when static key sets are involved and both successful and unsuccessful searches are allowed. We have shown that MPHFs provide the best tradeoff between space usage and lookup time when compared with linear hashing, quadratic hashing, double hashing, dense hashing, cuckoo hashing and sparse hashing. For example, MPHFs outperform linear hashing, quadratic hashing and double hashing when these methods have a hash table occupancy of 75% or higher (if the MPHF fits in the CPU cache the same happens for hash table occupancies greater than or equal to 55%). Furthermore, MPHFs also have a better performance in all measured aspects when compared to sparse hashing, which has been designed specifically for efficient memory usage.
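The definitions in the abstract above can be illustrated with a small sketch in Python. It is a brute-force toy, not the construction of [Botelho et al. 2007] or of the listed paper: it simply tries seeds of a salted hash until the keys of a small static set land in distinct slots. The names `slot`, `build_mphf`, `build_table` and `contains` are hypothetical, and the approach is only feasible for a handful of keys because the chance that a random seed is collision-free shrinks roughly exponentially with the set size.

```python
import hashlib

def slot(key: str, seed: int, n: int) -> int:
    """Hash `key` to a slot in [0, n) using a seeded 64-bit BLAKE2b digest."""
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8,
                             salt=seed.to_bytes(8, "little")).digest()
    return int.from_bytes(digest, "little") % n

def build_mphf(keys, max_seed=1_000_000):
    """Return a seed under which the distinct `keys` fill n slots with no collision."""
    n = len(keys)
    for seed in range(max_seed):
        if len({slot(k, seed, n) for k in keys}) == n:
            return seed
    raise ValueError("no collision-free seed found")

def build_table(keys, seed):
    """Place every key in its own slot; the table has exactly len(keys) entries."""
    table = [None] * len(keys)
    for k in keys:
        table[slot(k, seed, len(keys))] = k
    return table

def contains(table, seed, key):
    """Single probe; comparing the stored key handles unsuccessful searches."""
    return table[slot(key, seed, len(table))] == key

keys = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot"]
seed = build_mphf(keys)
table = build_table(keys, seed)
```

With this table, `contains(table, seed, "echo")` succeeds after one probe, and a key outside the set such as `"golf"` is rejected after one probe because the slot it hashes to holds a different key. Storing the keys in the table is what makes unsuccessful searches answerable. The hash function alone maps any input to some slot.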
Perfect hashing for data management applications
, 2007
Cited by 1 (0 self)
Perfect hash functions can potentially be used to compress data in connection with a variety of data management tasks. Though there has been considerable work on how to construct good perfect hash functions, there is a gap between theory and practice among all previous methods on minimal perfect hashing. On one side, there are good theoretical results without experimentally proven practicality for large key sets. On the other side, there are the theoretically analyzed time and space usage algorithms that assume that truly random hash functions are available for free, which is an unrealistic assumption. In this paper we attempt to bridge this gap between theory and practice, using a number of techniques from the literature to obtain a novel scheme that is theoretically well-understood and at the same time achieves an order-of-magnitude increase in performance compared to previous “practical” methods. This improvement comes from a combination of a novel, theoretically optimal perfect hashing scheme that greatly simplifies previous methods, and the fact that our algorithm is designed to make good use of the memory hierarchy. We demonstrate the scalability of our algorithm by considering a set of over one billion URLs from the World Wide Web of average length 64, for which we construct a minimal perfect hash function on a commodity PC in a little more than 1 hour. Our scheme produces minimal perfect hash functions using slightly more than 3 bits per key. For perfect hash functions in the range {0, ..., 2n − 1} the space usage drops to just over 2 bits per key (i.e., one bit more than optimal for representing the key). This is significantly below what has been achieved previously for very large values of n.
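As a rough sense of scale for the figures quoted in the abstract above, the arithmetic can be sketched in Python. The inputs are rounded-down assumptions taken from the abstract's wording (exactly one billion keys, exactly 3 bits per key, and an average length of 64 read as 64 bytes), so the results are approximate lower estimates rather than numbers reported by the paper. The names `mphf_bytes` and `raw_key_bytes` are hypothetical.

```python
def mphf_bytes(n_keys: int, bits_per_key: float) -> float:
    """Approximate size in bytes of a hash function description."""
    return n_keys * bits_per_key / 8

def raw_key_bytes(n_keys: int, avg_key_len: int) -> int:
    """Approximate size in bytes of the key set itself."""
    return n_keys * avg_key_len

N_KEYS = 1_000_000_000        # assumption: "over one billion URLs", rounded down
BITS_PER_KEY = 3              # assumption: "slightly more than 3 bits per key"
AVG_KEY_LEN = 64              # assumption: "average length 64", read as bytes

function_size = mphf_bytes(N_KEYS, BITS_PER_KEY)   # 375,000,000 bytes
keys_size = raw_key_bytes(N_KEYS, AVG_KEY_LEN)     # 64,000,000,000 bytes

print(f"function description: about {function_size / 1e9:.3f} GB")
print(f"raw keys:             about {keys_size / 1e9:.0f} GB")
```

Under these assumptions the function description is on the order of 0.4 GB while the keys themselves occupy about 64 GB, which is why a description of a few bits per key can be held in main memory even when the key set cannot.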
Submitted to the Department of Electrical Engineering and
, 1999
In recent years, the recognition of handwritten mathematical expressions has received an increasing amount of attention in pattern recognition research. The diversity of approaches to the problem and the lack of a commercially viable system, however, indicate that there is still much research to be done in this area. In this thesis, I will describe an on-line approach for converting a handwritten mathematical expression into an equivalent expression in a typesetting command language such as TeX or MathML, as well as a feedback-oriented user interface which can make errors more tolerable to the end user since they can be quickly corrected.
Provable Bounds for Portable and Flexible Privacy-Preserving Access Rights
, 2005
In this work we address the problem of portable and flexible privacy-preserving access rights for large online data repositories. Privacy-preserving access control means that the service provider can neither learn what access rights a customer has nor link a request to access an item to a particular customer, thus maintaining privacy of both customer activity and customer access rights. Flexible access rights allow any customer to choose any subset of items from the repository and correspondingly be charged only for the items selected. Portability of access rights means that the rights themselves can be stored on small devices of limited storage space and computational capabilities, and therefore the rights must be enforced using the limited resources available. Our main results are solutions to the problem that utilize minimal perfect hash functions and order-preserving minimal perfect hash functions. None of them use expensive cryptography, all require very little space, and they are therefore suitable for computationally weak and space-limited devices such as smartcards, sensors, etc. Performance of the schemes is measured as the probability of false positives (i.e., the probability that access to an unpurchased item will be permitted) for a given storage space bound. Using our techniques, for a data repository of size n and subscription order of m ≪ n items, we achieve a probability of false positives of m^{-c} using only O(cm) bits of storage space, where c is an adjustable parameter (a constant or otherwise) that can be set to provide the desired performance. This is the first time that such provable bounds are established for this problem, and we believe the techniques we use are of more general interest through the unusual use we make of perfect hashing.
GigaHash: Scalable Minimal Perfect Hashing for Billions of URLs
"... A minimal perfect function maps a static set of ..."
Blooming Trees for Minimal Perfect Hashing
Abstract. Hash tables are used in many networking applications, such as lookup and packet classification. But the issue of collision resolution makes their use slow and not suitable for fast operations. Therefore, perfect hash functions have been introduced to make the hashing mechanism more efficient. In particular, a minimal perfect hash function is a function that maps a set of n keys into a set of n integer numbers without collisions. In the literature, there are many schemes to construct a minimal perfect hash function, either based on mathematical properties of polynomials or on graph theory. This paper proposes a new scheme which shows remarkable results in terms of space consumption and processing speed. It is based on an alternative to Bloom Filters and requires about 4 bits per key and 12.8 seconds to construct an MPHF with 3.8 × 10^9 elements.
SPEECH AND
, 2005

