A Perfect Hash Function Generator
Cited by 49 (34 self)
gperf is a "software-tool generating-tool" designed to automate the generation of perfect hash functions. This paper describes the features, algorithms, and object-oriented design and implementation strategies incorporated in gperf. It also presents the results from an empirical comparison between gperf-generated recognizers and other popular techniques for reserved word lookup. gperf is distributed with the GNU libg++ library and is used to generate the keyword recognizers for the GNU C and GNU C++ compilers.
1 Introduction
Perfect hash functions are a time and space efficient implementation of static search sets, which are ADTs with operations like initialize, insert, and retrieve. Static search sets are common in system software applications. Typical static search sets include compiler and interpreter reserved words, assembler instruction mnemonics, and shell interpreter builtin commands. Search set elements are called keywords. Keywords are inserted into the set once, usually at c...
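The idea of a perfect hash function over a fixed keyword set can be sketched as follows. This is a minimal illustration in Python, not gperf's actual algorithm or generated output: it assumes a hash of the form "keyword length plus associated values of the first and last characters" and a small hypothetical keyword list. The names `KEYWORDS`, `find_associated_values`, `build_table`, and `is_keyword` are invented for this example.

```python
from itertools import product

# Hypothetical static search set (reserved words of a toy language).
KEYWORDS = ["if", "else", "while", "for", "return", "int", "char", "void"]


def raw_hash(word, asso):
    """Assumed hash form: length + value(first char) + value(last char)."""
    return len(word) + asso.get(word[0], 0) + asso.get(word[-1], 0)


def find_associated_values(keywords, max_value=7):
    """Backtracking search for per-character values that make raw_hash collision-free."""
    asso, used = {}, set()

    def search(i):
        if i == len(keywords):
            return True
        word = keywords[i]
        # Characters of this keyword that have no value assigned yet.
        new_chars = [c for c in sorted({word[0], word[-1]}) if c not in asso]
        for values in product(range(max_value + 1), repeat=len(new_chars)):
            asso.update(zip(new_chars, values))
            h = raw_hash(word, asso)
            if h not in used:
                used.add(h)
                if search(i + 1):
                    return True
                used.remove(h)
            for c in new_chars:  # undo the trial assignment
                del asso[c]
        return False

    if not search(0):
        raise ValueError("no collision-free assignment within max_value")
    return asso


def build_table(keywords, asso):
    """Place each keyword in the slot given by its hash value."""
    table = [None] * (max(raw_hash(k, asso) for k in keywords) + 1)
    for k in keywords:
        table[raw_hash(k, asso)] = k
    return table


def is_keyword(word, asso, table):
    """One hash computation plus one string comparison."""
    if not word:
        return False
    h = raw_hash(word, asso)
    return h < len(table) and table[h] == word


ASSO = find_associated_values(KEYWORDS)
TABLE = build_table(KEYWORDS, ASSO)
print(is_keyword("while", ASSO, TABLE), is_keyword("whale", ASSO, TABLE))
```

The final string comparison in `is_keyword` is what rejects non-keywords that happen to hash to an occupied slot. The search runs once, when the recognizer is built, which matches the static search set setting in which keywords are inserted once and then only retrieved.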
A Comparison of Hashing Schemes for Address Lookup in Computer Networks
- IEEE Transactions on Communications
, 1992
Cited by 43 (1 self)
Using a trace of address references, we compared the efficiency of several different hashing functions, such as cyclic redundancy checking (CRC) polynomials, Fletcher checksum, folding of address octets using the exclusive-or operation, and bit extraction from the address. Guidelines are provided for determining the size of hash mask required to achieve a specified level of performance.
1 INTRODUCTION
The trend toward networks becoming larger and faster, and addresses becoming larger, has impelled a need to explore alternatives for fast address recognition. This problem is actually a special case of the general problem of searching through a large database and finding the information associated with a given key. For example, datalink adapters on local area networks (LANs) need to recognize the multicast destination addresses of frames on the LAN. Bridges, used to interconnect two or more LANs, have to recognize the destination addresses of every frame and decide quickly whether to receive...
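The four families of hash functions named in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only: the 48-bit example address, the 8-bit default mask, the Fletcher-16 variant, and the use of the CRC-32 polynomial from `zlib` are assumptions made here, not the configurations evaluated in the paper.

```python
import zlib


def xor_fold(addr: bytes, mask_bits: int = 8) -> int:
    """Fold all address octets together with XOR (yields at most 8 bits)."""
    h = 0
    for octet in addr:
        h ^= octet
    return h & ((1 << mask_bits) - 1)


def bit_extract(addr: bytes, mask_bits: int = 8) -> int:
    """Use the low-order bits of the address directly as the hash."""
    return int.from_bytes(addr, "big") & ((1 << mask_bits) - 1)


def fletcher16(addr: bytes) -> int:
    """Fletcher-16 checksum: two running sums modulo 255."""
    sum1 = sum2 = 0
    for octet in addr:
        sum1 = (sum1 + octet) % 255
        sum2 = (sum2 + sum1) % 255
    return (sum2 << 8) | sum1


def crc_hash(addr: bytes, mask_bits: int = 8) -> int:
    """CRC-32 of the address, truncated by the hash mask."""
    return zlib.crc32(addr) & ((1 << mask_bits) - 1)


# Hypothetical 48-bit LAN address used only for illustration.
ADDR = bytes.fromhex("00005e005301")
print(xor_fold(ADDR), bit_extract(ADDR), fletcher16(ADDR) & 0xFF, crc_hash(ADDR))
```

The `mask_bits` parameter plays the role of the hash mask size: a wider mask gives more buckets and fewer collisions at the cost of a larger table.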
Biosequence similarity search on the Mercury system
- In Proc. IEEE 15th Int. Conf. on Application-Specific Systems, Architectures and Processors (ASAP04)
, 2004
Cited by 15 (9 self)
Biosequence similarity search is an important application in modern molecular biology. Search algorithms aim to identify sets of sequences whose extensional similarity suggests a common evolutionary origin or function. The most widely used similarity search tool for biosequences is BLAST, a program designed to compare query sequences to a database. Here, we present the design of BLASTN, the version of BLAST that searches DNA sequences, on the Mercury system, an architecture that supports high-volume, high-throughput data movement off a data store and into reconfigurable hardware. An important component of application deployment on the Mercury system is the functional decomposition of the application onto both the reconfigurable hardware and the traditional processor. Both the Mercury BLASTN application design and its performance analysis are described.
Perfect Hash Table Algorithm For Image Databases Using Negative Associated Values
, 1995
Cited by 8 (1 self)
A 2D string data structure allows for efficient spatial reasoning on an image database for query and retrieval. A 2D string can be converted to a set of triples leading to an elegant O(1) solution for image retrieval with simple queries using a perfect hash table. For complex queries, the retrieval complexity is linear in this approach and depends on the number of possible pairings of picture objects in the query. The perfect hash table computation for this problem is mapped directly to a permutation problem. In an earlier paper [1], we presented a set of heuristics that result in a fast computation of associated values, for picture objects, used in the calculation of hash addresses. In this paper, we present an additional heuristic leading to a 90% reduction in search space over our earlier algorithm. The new heuristic promises to generate a minimal perfect hash function for each experimental data set, which was not possible with the earlier algorithms. Mathematical analysis of comple...
Efficient Distributed Subtyping Tests
- In Proceedings of the ACM International Conference on Distributed Event-based Systems
, 2007
Cited by 3 (1 self)
Subtyping tests are essential in typed publish/subscribe infrastructures, especially when the underlying programming language supports subtype conformance, as in Java or C#. These tests are particularly challenging when the publish/subscribe infrastructure is distributed, because processes have diverging views and new types may be added in a decentralized manner. Perhaps surprisingly, subtyping tests for such distributed systems have received little attention so far; they are usually strongly intertwined with serialization and code transfer mechanisms. This paper presents an efficient subtype testing method for event objects received over the wire, requiring neither the download of a full description of the types or classes of these objects nor their deserialization. We use a slicing technique that encodes a multiple subtyping hierarchy with as little memory as the best known centralized type encoding, but allows for the dynamic addition of event types without re-computing the encoding. We convey the practicality of our approach through performance measures obtained with standard Java libraries in a publish/subscribe system. Our approach performs between 3 and 12 times faster than a code transfer approach without adding overhead to object deserialization, and requires the same testing time as a straightforward string-based type encoding while reducing the encoding length by a factor of 50.
A Letter-oriented Perfect Hashing Scheme Based upon Sparse Table Compression
, 1991
Cited by 2 (0 self)
In this paper, a new letter-oriented perfect hashing scheme based on Ziegler's row displacement method is presented. A unique n-tuple from a given set of static letter-oriented key words can be extracted by a heuristic algorithm. The extracted distinct n-tuples are then associated with a 0/1 sparse matrix. Using a sparse matrix compression technique, a perfect hashing function on the key words is then constructed.
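Row displacement compression of a 0/1 sparse matrix can be sketched as below. This is a generic first-fit version written for illustration, not the heuristic from the paper: the choice to place denser rows first, the example matrix `ROWS`, and the function names are assumptions. Each row is shifted by a displacement so that no two 1-entries land on the same position of a one-dimensional array; the position `displacement[r] + c` then serves as a collision-free hash address for the pair `(r, c)`.

```python
def compress_rows(rows):
    """rows[r] is the set of column indices holding a 1 in row r.

    Returns (displacement, owner): displacement[r] is the shift applied to
    row r, and owner maps each occupied 1-D position to the row that owns it.
    """
    displacement = [0] * len(rows)
    owner = {}
    # Assumption: first-fit, handling the densest rows first.
    for r in sorted(range(len(rows)), key=lambda i: -len(rows[i])):
        d = 0
        # Slide the row right until none of its 1-entries collide.
        while any((d + c) in owner for c in rows[r]):
            d += 1
        displacement[r] = d
        for c in rows[r]:
            owner[d + c] = r
    return displacement, owner


def lookup(displacement, owner, r, c):
    """True exactly when the original matrix has a 1 at (r, c)."""
    return owner.get(displacement[r] + c) == r


# Hypothetical 4 x 3 sparse matrix with five 1-entries.
ROWS = [{0, 2}, {0, 1}, {2}, set()]
DISPLACEMENT, OWNER = compress_rows(ROWS)
print(DISPLACEMENT, sorted(OWNER))
```

Storing the owning row at each position is what makes the lookup exact: a 0-entry either maps to an empty position or to one owned by a different row.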
Implementing statically typed object-oriented programming languages
, 2002
Cited by 1 (0 self)
Object-oriented programming languages represent an original implementation issue due to the mechanism known as late binding, aka message sending. The underlying principle is that the address of the actually called procedure is not statically determined, at compile-time, but depends on the dynamic type of a distinguished parameter known as the receiver. In statically typed languages, the point is that the receiver’s dynamic type may be a subtype of its static type. A similar issue arises with attributes, because their position in the object layout may depend on the object’s dynamic type. Furthermore, subtyping introduces another original feature, i.e. subtype checks. All three mechanisms need specific implementations, data structures and algorithms. In statically typed languages, late binding is generally implemented with tables, called virtual function tables in C++ jargon. These tables reduce method calls to function calls, through a small fixed number of extra indirections. It follows that object-oriented programming yields some overhead, as compared to usual procedural languages. The different techniques and their resulting overhead depend on several parameters. Firstly, inheritance and subtyping may be single or multiple and a mixing is even possible, as in Java,
Extensible Encoding of Type Hierarchies
Cited by 1 (0 self)
The subtyping test consists of checking whether a type t is a descendant of a type r (Agrawal et al. 1989). We study how to perform such a test efficiently, assuming a dynamic hierarchy in which new types are inserted at run-time. The goal is to achieve time and space efficiency, even as new types are inserted. We propose an extensible scheme, named ESE, that ensures (1) efficient insertion of new types, (2) efficient subtyping tests, and (3) small space usage. On the one hand, ESE provides comparable test times to the most efficient existing static schemes (e.g., Zibin et al. (2001)). On the other hand, ESE has comparable insertion times to the most efficient existing dynamic scheme (Baehni et al. 2007), while outperforming it by a factor of 2-3 in terms of space usage.
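To make the problem statement concrete, here is a naive baseline for subtyping tests in a hierarchy that grows at run-time. It is not the ESE scheme and not any of the cited encodings: it simply stores the full ancestor set of every type, which gives fast tests and simple insertion but uses far more space than a compact encoding would. The class name `TypeHierarchy` and the example type names are hypothetical.

```python
class TypeHierarchy:
    """Naive dynamic hierarchy: each type keeps the set of all its ancestors."""

    def __init__(self):
        self._ancestors = {}

    def add_type(self, name, parents=()):
        """Insert a new type below already-known parents (multiple parents allowed)."""
        if name in self._ancestors:
            raise ValueError(f"type {name!r} already exists")
        ancestors = {name}  # every type is a subtype of itself
        for parent in parents:
            ancestors |= self._ancestors[parent]  # KeyError if parent is unknown
        self._ancestors[name] = frozenset(ancestors)

    def is_subtype(self, t, r):
        """True when t is r or a descendant of r."""
        return r in self._ancestors[t]


hierarchy = TypeHierarchy()
hierarchy.add_type("Object")
hierarchy.add_type("Event", ["Object"])
hierarchy.add_type("Serializable", ["Object"])
hierarchy.add_type("StockEvent", ["Event", "Serializable"])
print(hierarchy.is_subtype("StockEvent", "Serializable"))
```

Because a new type only reads its parents' sets, existing entries never need to be recomputed on insertion. The cost is space that grows with the number of ancestors per type, which is the kind of overhead compact encodings aim to reduce.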
User's Guide for the GNU gperf Utility
, 1989
Data Type with certain fundamental operations, e.g., initialize, insert, and retrieve. Conceptually, all insertions occur before any retrievals. It is a useful data structure for representing static search sets. Static search sets occur frequently in software system applications. Typical static search sets include compiler reserved words, assembler instruction opcodes, and built-in shell interpreter commands. Search set members, called keywords, are inserted into the structure only once, usually during program initialization, and are not generally modified at run-time. Numerous static search structure implementations exist, e.g., arrays, linked lists, binary search trees, digital search tries, and hash tables. Different approaches offer trade-offs between space utilization and search time efficiency. For example, an $n$ element sorted array is space efficient, though the average-case time complexity for retrieval operations using binary search is proportional to $\log n$. Converse...
Average Case Analysis for a Simple Compression Algorithm
, 1998
In this paper, we treat the static dictionary problem, very well-known in computer science. It consists in storing a set S of m elements in the range [1..n] so that membership queries on S's elements can be handled in O(1) time. It can be approached as a table compression problem in which a size n table has m 1's and the other elements are 0's. We focus our attention on sparse cases (m ≪ n). We use a simple algorithm to solve the problem and make an average case analysis of the total space required when the input derives from uniform probability distribution. We also find some conditions able to minimize storage requirements. We then propose and analyze a new algorithm able to drastically reduce storage requirements to O(m^(4/3)).
1 Introduction
The static dictionary problem (very well-known in computer science) consists in storing a set S of m elements in the range [1..n] in such a way that we are able to retrieve an item belonging to S in constant time. In this paper, we examine th...

