Results 1-10 of 20
Succinct indexable dictionaries with applications to encoding k-ary trees and multisets
 In Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)
Cited by 200 (9 self)
We consider the indexable dictionary problem, which consists of storing a set S ⊆ {0,...,m − 1} for some integer m, while supporting the operations of rank(x), which returns the number of elements in S that are less than x if x ∈ S, and −1 otherwise; and select(i), which returns the i-th smallest element in S. We give a data structure that supports both operations in O(1) time on the RAM model and requires B(n,m) + o(n) + O(lg lg m) bits to store a set of size n, where $B(n,m) = \lceil \lg \binom{m}{n} \rceil$ is the minimum number of bits required to store any n-element subset from a universe of size m. Previous dictionaries taking this space only supported (yes/no) membership queries in O(1) time. In the cell probe model we can remove the O(lg lg m) additive term in the space bound, answering a question raised by Fich and Miltersen, and Pagh. We present extensions and applications of our indexable dictionary data structure, including:
• an information-theoretically optimal representation of a k-ary cardinal tree that supports standard operations in constant time,
• a representation of a multiset of size n from {0,...,m − 1} in B(n,m+n) + o(n) bits that supports (appropriate generalizations of) rank and select operations in constant time, and
• a representation of a sequence of n nonnegative integers summing up to m in B(n,m+n) + o(n) bits that supports prefix sum queries in constant time.
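The rank/select interface and the space bound B(n,m) from this abstract can be illustrated with a small reference sketch. This is not the succinct structure from the paper: it keeps a plain sorted list, answers rank by binary search rather than in O(1) time, and uses far more than B(n,m) bits. The class name `NaiveIndexableDictionary`, the helper `min_bits`, and the 0-based convention for select(i) (chosen so that select(rank(x)) = x) are assumptions made for illustration.

```python
from bisect import bisect_left
from math import comb


def min_bits(n, m):
    """B(n, m) = ceil(lg C(m, n)): bits needed to name one n-subset of an m-universe."""
    # For an integer c >= 1, ceil(log2(c)) equals (c - 1).bit_length().
    return (comb(m, n) - 1).bit_length()


class NaiveIndexableDictionary:
    """Reference semantics only: sorted list plus binary search (not succinct)."""

    def __init__(self, elements, m):
        self.m = m
        self.items = sorted(set(elements))
        if self.items and not (0 <= self.items[0] and self.items[-1] < m):
            raise ValueError("elements must lie in {0, ..., m - 1}")

    def rank(self, x):
        """Number of elements smaller than x if x is in the set, else -1."""
        i = bisect_left(self.items, x)
        if i < len(self.items) and self.items[i] == x:
            return i
        return -1

    def select(self, i):
        """Element with exactly i smaller elements (0-based; an assumption)."""
        if not 0 <= i < len(self.items):
            raise IndexError("select index out of range")
        return self.items[i]


d = NaiveIndexableDictionary([3, 7, 1, 12], m=16)
print(d.rank(7), d.rank(5), d.select(0), min_bits(4, 16))
```

With this convention rank and select are inverse to each other on the stored set, which is the property the multiset and prefix-sum applications listed above build on.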
Reducing the Space Requirement of Suffix Trees
 Software – Practice and Experience
, 1999
Cited by 120 (11 self)
We show that suffix trees store various kinds of redundant information. We exploit these redundancies to obtain more space efficient representations. The most space efficient of our representations requires 20 bytes per input character in the worst case, and 10.1 bytes per input character on average for a collection of 42 files of different type. This is an advantage of more than 8 bytes per input character over previous work. Our representations can be constructed without extra space, and as fast as previous representations. The asymptotic running times of suffix tree applications are retained. Copyright © 1999 John Wiley & Sons, Ltd. KEY WORDS: data structures; suffix trees; implementation techniques; space reduction
Space Efficient Hash Tables With Worst Case Constant Access Time
 In STACS
, 2003
Cited by 47 (4 self)
We generalize Cuckoo Hashing [23] to d-ary Cuckoo Hashing and show how this yields a simple hash table data structure that stores n elements in (1 + ε) n memory cells, for any constant ε > 0. Assuming uniform hashing, accessing or deleting table entries takes at most d = O(ln(1/ε)) probes and the expected amortized insertion time is constant. This is the first dictionary that has worst case constant access time and expected constant update time, works with (1 + ε) n space, and supports satellite information. Experiments indicate that d = 4 choices suffice for ε ≈ 0.03. We also describe variants of the data structure that allow the use of hash functions that can be evaluated in constant time.
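A minimal sketch of the idea in this abstract, under stated assumptions: each key has d candidate cells, a lookup probes at most those d cells, and an insertion that finds all d cells occupied evicts one occupant and re-places it. The random-walk eviction, the salted use of Python's built-in `hash`, the `max_kicks` limit, and the rehash-on-failure policy are illustrative choices and are not taken from the paper.

```python
import random


class DAryCuckooTable:
    """Illustrative d-ary cuckoo hash table storing (key, value) pairs."""

    def __init__(self, capacity, d=4, max_kicks=500, seed=0):
        self.slots = [None] * capacity
        self.d = d
        self.max_kicks = max_kicks
        self.size = 0
        self.rng = random.Random(seed)
        self.salts = [self.rng.getrandbits(64) for _ in range(d)]

    def _positions(self, key):
        # d candidate cells per key; salted built-in hash stands in for uniform hashing.
        return [hash((salt, key)) % len(self.slots) for salt in self.salts]

    def lookup(self, key):
        """Probe at most d cells; returns None when the key is absent."""
        for p in self._positions(key):
            entry = self.slots[p]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def delete(self, key):
        for p in self._positions(key):
            entry = self.slots[p]
            if entry is not None and entry[0] == key:
                self.slots[p] = None
                self.size -= 1
                return True
        return False

    def insert(self, key, value):
        for p in self._positions(key):
            entry = self.slots[p]
            if entry is not None and entry[0] == key:
                self.slots[p] = (key, value)  # overwrite existing key
                return
        if self.size >= len(self.slots):
            raise OverflowError("table is full")
        self.size += 1
        self._place((key, value))

    def _place(self, entry):
        for _ in range(self.max_kicks):
            positions = self._positions(entry[0])
            for p in positions:
                if self.slots[p] is None:
                    self.slots[p] = entry
                    return
            # All d cells are taken: evict a random occupant and re-place it.
            p = self.rng.choice(positions)
            entry, self.slots[p] = self.slots[p], entry
        self._rehash(entry)

    def _rehash(self, pending):
        # Fallback: draw new hash functions and re-place every stored entry.
        entries = [e for e in self.slots if e is not None] + [pending]
        self.salts = [self.rng.getrandbits(64) for _ in range(self.d)]
        self.slots = [None] * len(self.slots)
        for e in entries:
            self._place(e)


table = DAryCuckooTable(capacity=120, d=4)
for k in range(100):
    table.insert(k, k * k)
print(table.lookup(7), table.lookup(1000))
```

The example fills 100 of 120 cells, a looser load than the (1 + ε) n with ε ≈ 0.03 reported in the abstract, so that the simple random-walk insertion rarely needs the rehash fallback.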
Succinct dynamic dictionaries and trees
 In Proceedings of the 30th International Colloquium on Automata, Languages and Programming (ICALP)
, 2003
Cited by 29 (5 self)
Abstract. We consider space-efficient solutions to two dynamic data structuring problems. We first give a representation of a set S ⊆ U = {0,...,m − 1}, |S| = n that supports membership queries in O(1) worst case time and insertions into/deletions from S in O(1) expected amortised time. The representation uses B + o(B) bits, where B = lg
Bonsai: A Compact Representation of Trees
, 1993
Cited by 12 (0 self)
This paper shows how trees can be stored in a very compact form, called 'Bonsai', using hash tables. A method is described that is suitable for large trees that grow monotonically within a predefined maximum size limit. Using it, pointers in any tree can be represented within $6 + \log_2 n$ bits per node, where n is the maximum number of children a node can have. We first describe a general way of storing trees in hash tables, and then introduce the idea of compact hashing which underlies the Bonsai structure. These two techniques are combined to give a compact representation of trees, and a practical methodology is set out to permit the design of these structures. The new representation is compared with two conventional tree implementations in terms of the storage required per node. Examples of programs that must store large trees within a strict maximum size include those that operate on trie structures derived from natural language text. We describe how the Bonsai technique has been applied to the trees that arise in text compression and adaptive prediction, and include a discussion of the design parameters that work well in practice.
Don’t thrash: How to cache your hash on flash
 In Proceedings of the 38th International Conference on Very Large Data Bases
, 2012
Cited by 5 (1 self)
As the Internet grows, computers collect, store, search, and index data at increasingly rapid rates. A recent IDC study estimates that more than 300 Exabytes of hard-disk storage will be delivered in the next five years to
Compact Dictionaries for Variable-Length Keys and Data, with Applications
, 2007
Cited by 5 (0 self)
We consider the problem of maintaining a dynamic dictionary T of keys and associated data for which both the keys and data are bit strings that can vary in length from zero up to the length w of a machine word. We present a data structure for this variable-bit-length dictionary problem that supports constant time lookup and expected amortized constant time insertion and deletion. It uses $O(m + 3n − n \log_2 n)$ bits, where n is the number of elements in T, and m is the total number of bits across all strings in T (keys and data). Our dictionary uses an array A[1...n] in which locations store variable-bit-length strings. We present a data structure for this variable-bit-length array problem that supports worst-case constant-time lookups and updates and uses O(m + n) bits, where m is the total number of bits across all strings stored in A. The motivation for these structures is to support applications for which it is helpful to efficiently store short varying length bit strings. We present several applications, including representations for semi-dynamic graphs, order queries on integer sets, cardinal trees with varying cardinality, and simplicial meshes of d dimensions. These results either generalize or simplify previous results.
Compact Data Structures with Fast Queries
, 2005
Cited by 4 (0 self)
Many applications dealing with large data structures can benefit from keeping them in compressed form. Compression has many benefits: it can allow a representation to fit in main memory rather than swapping out to disk, and it improves cache performance since it allows more data to fit into the cache. However, a data structure is only useful if it allows the application to perform fast queries (and updates) to the data.
Cache-, Hash- and Space-Efficient Bloom Filters
Cited by 3 (0 self)
A Bloom filter is a very compact data structure that supports approximate membership queries on a set, allowing false positives. We propose several new variants of Bloom filters and replacements with similar functionality. All of them have a better cache-efficiency and need fewer hash bits than regular Bloom filters. Some use SIMD functionality, while the others provide an even better space efficiency. As a consequence, we get a more flexible tradeoff between false positive rate, space-efficiency, cache-efficiency, hash-efficiency, and computational effort. We analyze the efficiency of Bloom filters and the proposed replacements in detail, in terms of the false positive rate, the number of expected cache misses, and the number of required hash bits. We also describe and experimentally evaluate the performance of highly tuned implementations. For many settings, our alternatives perform better than the methods proposed so far.
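One way to picture a cache-conscious Bloom filter is the "blocked" layout sketched below: a first hash picks a single block (sized like a cache line), and all k bits for an item are set inside that block, so a query touches one block only. Whether this matches any of the paper's variants in detail is an assumption; the class name, the 512-bit block size, the SHA-256 based hashing, and the double-hashing scheme for the k bit positions are all illustrative choices.

```python
import hashlib


class BlockedBloomFilter:
    """Illustrative blocked Bloom filter: every item maps to exactly one block."""

    def __init__(self, num_blocks, block_bits=512, k=4):
        self.num_blocks = num_blocks
        self.block_bits = block_bits
        self.k = k
        # Each block is one Python int used as a bitmask of block_bits bits.
        self.blocks = [0] * num_blocks

    def _probe(self, item):
        digest = hashlib.sha256(repr(item).encode()).digest()
        block = int.from_bytes(digest[:8], "big") % self.num_blocks
        h1 = int.from_bytes(digest[8:16], "big")
        h2 = int.from_bytes(digest[16:24], "big") | 1  # odd stride
        mask = 0
        for i in range(self.k):
            mask |= 1 << ((h1 + i * h2) % self.block_bits)
        return block, mask

    def add(self, item):
        block, mask = self._probe(item)
        self.blocks[block] |= mask

    def __contains__(self, item):
        # True may be a false positive; False is always correct.
        block, mask = self._probe(item)
        return (self.blocks[block] & mask) == mask


bf = BlockedBloomFilter(num_blocks=64)
for word in ["cache", "hash", "space"]:
    bf.add(word)
print("cache" in bf, "hash" in bf)
```

Confining an item to one block trades a somewhat higher false positive rate for fewer cache misses per query, which is the kind of tradeoff the abstract describes.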
Parallel Recursive State Compression for Free
 Proc. 18th Int. Spin Workshop on Model Checking Software, Springer Verlag, LNCS
, 2011
Cited by 2 (1 self)
Abstract. State space exploration is a basic solution to many verification problems, but is limited by time and memory usage. Due to physical limits in modern CPUs, sequential exploration algorithms do not benefit automatically from the next generation of processors anymore, hence the need for multi-core solutions. This paper focuses on reducing memory usage in enumerative model checking, while maintaining the multi-core scalability obtained in earlier work. We present a tree-based multi-core compression method, which works by leveraging sharing among subvectors of state vectors. An algorithmic analysis of both worst-case and optimal compression ratios shows the potential to compress even large states to a small constant on average (8 bytes). Our experiments demonstrate that this holds up in practice: the median compression ratio of 279 measured experiments is within 17% of the optimum for tree compression, and five times better than the median compression ratio of Spin's Collapse compression. Our algorithms are implemented in the LTSmin tool, and our experiments show that for model checking, multi-core tree compression pays its own way: it comes virtually without overhead compared to the fastest hash table-based methods.
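The sharing among subvectors mentioned in this abstract can be sketched as follows: a state vector is split recursively into halves, and each (left, right) pair is stored once in a table and replaced by its index, so states that agree on a subvector reuse the same entries. This is a sequential toy version only; it does not attempt the multi-core, lock-free aspects of the paper or the LTSmin implementation, and the class and method names are hypothetical.

```python
class TreeCompressor:
    """Illustrative recursive pair-interning of fixed-length state vectors."""

    def __init__(self):
        self.index = {}  # (left, right) -> identifier
        self.pairs = []  # identifier -> (left, right)

    def _intern(self, pair):
        ident = self.index.get(pair)
        if ident is None:
            ident = len(self.pairs)
            self.index[pair] = ident
            self.pairs.append(pair)
        return ident

    def compress(self, vec):
        """Return one identifier for the whole vector (the value itself if length 1)."""
        if len(vec) == 1:
            return vec[0]
        mid = len(vec) // 2
        return self._intern((self.compress(vec[:mid]), self.compress(vec[mid:])))

    def decompress(self, ident, length):
        """Rebuild a vector; the known length fixes how each pair is interpreted."""
        if length == 1:
            return [ident]
        mid = length // 2
        left, right = self.pairs[ident]
        return self.decompress(left, mid) + self.decompress(right, length - mid)


tc = TreeCompressor()
a = tc.compress((1, 2, 3, 4))
b = tc.compress((1, 2, 3, 5))
print(tc.decompress(a, 4), tc.decompress(b, 4), len(tc.pairs))
```

The two example states differ in one slot, so the second one adds only two new pairs (its changed half and its root) while the pair for (1, 2) is reused.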