Results 1 - 10
of
21
Succinct indexable dictionaries with applications to encoding k-ary trees and multisets
- In Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA
"... We consider the indexable dictionary problem, which consists of storing a set S ⊆ {0,...,m − 1} for some integer m, while supporting the operations of rank(x), which returns the number of elements in S that are less than x if x ∈ S, and −1 otherwise; and select(i) which returns the i-th smallest ele ..."
Abstract
-
Cited by 149 (5 self)
- Add to MetaCart
We consider the indexable dictionary problem, which consists of storing a set S ⊆ {0,...,m − 1} for some integer m, while supporting the operations of rank(x), which returns the number of elements in S that are less than x if x ∈ S, and −1 otherwise; and select(i) which returns the i-th smallest element in S. We give a data structure that supports both operations in O(1) time on the RAM model and requires B(n,m)+ o(n)+O(lg lg m) bits to store a set of size n, where B(n,m) = ⌈ lg ( m) ⌉ n is the minimum number of bits required to store any n-element subset from a universe of size m. Previous dictionaries taking this space only supported (yes/no) membership queries in O(1) time. In the cell probe model we can remove the O(lg lg m) additive term in the space bound, answering a question raised by Fich and Miltersen, and Pagh. We present extensions and applications of our indexable dictionary data structure, including: • an information-theoretically optimal representation of a k-ary cardinal tree that supports standard operations in constant time, • a representation of a multiset of size n from {0,...,m − 1} in B(n,m+n) + o(n) bits that supports (appropriate generalizations of) rank and select operations in constant time, and • a representation of a sequence of n non-negative integers summing up to m in B(n,m + n) + o(n) bits that supports prefix sum queries in constant time. 1
Compressed suffix trees with full functionality
- Theory of Computing Systems
"... We introduce new data structures for compressed suffix trees whose size are linear in the text size. The size is measured in bits; thus they occupy only O(n log |A|) bits for a text of length n on an alphabet A. This is a remarkable improvement on current suffix trees which require O(n log n) bits. ..."
Abstract
-
Cited by 45 (5 self)
- Add to MetaCart
We introduce new data structures for compressed suffix trees whose size are linear in the text size. The size is measured in bits; thus they occupy only O(n log |A|) bits for a text of length n on an alphabet A. This is a remarkable improvement on current suffix trees which require O(n log n) bits. Though some components of suffix trees have been compressed, there is no linear-size data structure for suffix trees with full functionality such as computing suffix links, string-depths and lowest common ancestors. The data structure proposed in this paper is the first one that has linear size and supports all operations efficiently. Any algorithm running on a suffix tree can also be executed on our compressed suffix trees with a slight slowdown of a factor of polylog(n). 1
Breaking a Time-and-Space Barrier in Constructing Full-Text Indices
"... Suffix trees and suffix arrays are the most prominent full-text indices, and their construction algorithms are well studied. It has been open for a long time whether these indicescan be constructed in both o(n log n) time and o(n log n)-bit working space, where n denotes the length of the text. Int ..."
Abstract
-
Cited by 44 (3 self)
- Add to MetaCart
Suffix trees and suffix arrays are the most prominent full-text indices, and their construction algorithms are well studied. It has been open for a long time whether these indicescan be constructed in both o(n log n) time and o(n log n)-bit working space, where n denotes the length of the text. Inthe literature, the fastest algorithm runs in O(n) time, whileit requires O(n log n)-bit working space. On the other hand,the most space-efficient algorithm requires O(n)-bit work-ing space while it runs in O(n log n) time. This paper breaks the long-standing time-and-space bar-rier under the unit-cost word RAM. We give an algorithm for constructing the suffix array which takes O(n) time and O(n)-bit working space, for texts with constant-size alpha-bets. Note that both the time and the space bounds are optimal. For constructing the suffix tree, our algorithm re-quires O(n logffl n) time and O(n)-bit working space forany 0! ffl! 1. Apart from that, our algorithm can alsobe adopted to build other existing full-text indices, such as
Exact and Approximate Distances in Graphs - a survey
- In ESA
, 2001
"... We survey recent and not so recent results related to the computation of exact and approximate distances, and corresponding shortest, or almost shortest, paths in graphs. We consider many different settings and models and try to identify some remaining open problems. ..."
Abstract
-
Cited by 43 (0 self)
- Add to MetaCart
We survey recent and not so recent results related to the computation of exact and approximate distances, and corresponding shortest, or almost shortest, paths in graphs. We consider many different settings and models and try to identify some remaining open problems.
Cell Probe Complexity - a Survey
- In 19th Conference on the Foundations of Software Technology and Theoretical Computer Science (FSTTCS), 1999. Advances in Data Structures Workshop
, 1999
"... The cell probe model is a general, combinatorial model of data structures. We give a survey of known results about the cell probe complexity of static and dynamic data structure problems, with an emphasis on techniques for proving lower bounds. 1 Introduction 1.1 The 'Were-you-last?' game A Dre ..."
Abstract
-
Cited by 27 (0 self)
- Add to MetaCart
The cell probe model is a general, combinatorial model of data structures. We give a survey of known results about the cell probe complexity of static and dynamic data structure problems, with an emphasis on techniques for proving lower bounds. 1 Introduction 1.1 The 'Were-you-last?' game A Dream Team, consisting of m players, is held captive in the dungeon of their adversary, Hannibal. He now makes them play his favourite game, Were-you-last?. Before the game starts the players of the Team are allowed to meet to discuss a strategy (obviously, Hannibal has the room bugged and is listening in). After the discussion they are led to separate waiting rooms. Then Hannibal leads each of the players of the team, one by one, to the playing field. The players do not know the order in which they are led to the field and they spend their time there alone. The playing field is a room, containing an infinite number of boxes, labelled 0, 1, 2, 3, . . . . Inside each box is a switch that can be ...
Faster Deterministic Dictionaries
- In 11 th Annual ACM Symposium on Discrete Algorithms (SODA
, 1999
"... We consider static dictionaries over the universe U = on a unit-cost RAM with word size w. Construction of a static dictionary with linear space consumption and constant lookup time can be done in linear expected time by a randomized algorithm. In contrast, the best previous deterministic a ..."
Abstract
-
Cited by 9 (5 self)
- Add to MetaCart
We consider static dictionaries over the universe U = on a unit-cost RAM with word size w. Construction of a static dictionary with linear space consumption and constant lookup time can be done in linear expected time by a randomized algorithm. In contrast, the best previous deterministic algorithm for constructing such a dictionary with n elements runs in time O(n ) for # > 0. This paper narrows the gap between deterministic and randomized algorithms exponentially, from the factor of to an O(log n) factor. The algorithm is weakly non-uniform, i.e. requires certain precomputed constants dependent on w. A by-product of the result is a lookup time vs insertion time trade-o# for dynamic dictionaries, which is optimal for a certain class of deterministic hashing schemes.
Hash and displace: Efficient evaluation of minimal perfect hash functions
- In Workshop on Algorithms and Data Structures
, 1999
"... A new way of constructing (minimal) perfect hash functions is described. The technique considerably reduces the overhead associated with resolving buckets in two-level hashing schemes. Evaluating a hash function requires just one multiplication and a few additions apart from primitive bit operations ..."
Abstract
-
Cited by 6 (1 self)
- Add to MetaCart
A new way of constructing (minimal) perfect hash functions is described. The technique considerably reduces the overhead associated with resolving buckets in two-level hashing schemes. Evaluating a hash function requires just one multiplication and a few additions apart from primitive bit operations. The number of accesses to memory is two, one of which is to a fixed location. This improves the probe performance of previous minimal perfect hashing schemes, and is shown to be optimal. The hash function description (“program”) for a set of size n occupies O(n) words, and can be constructed in expected O(n) time. 1
Random Access to Grammar-Compressed Strings
, 2011
"... Let S be a string of length N compressed into a contextfree grammar S of size n. We present two representations of S achieving O(log N) random access time, and either O(n · αk(n)) construction time and space on the pointer machine model, or O(n) construction time and space on the RAM. Here, αk(n) is ..."
Abstract
-
Cited by 6 (0 self)
- Add to MetaCart
Let S be a string of length N compressed into a contextfree grammar S of size n. We present two representations of S achieving O(log N) random access time, and either O(n · αk(n)) construction time and space on the pointer machine model, or O(n) construction time and space on the RAM. Here, αk(n) is the inverse of the k th row of Ackermann’s function. Our representations also efficiently support decompression of any substring in S: we can decompress any substring of length m in the same complexity as a single random access query and additional O(m) time. Combining these results with fast algorithms for uncompressed approximate string matching leads to several efficient algorithms for approximate string matching on grammar-compressed strings without decompression. For instance, we can find all approximate occurrences of a pattern P with at most k errors in time O(n(min{|P |k, k 4 + |P |} + log N) + occ), where occ is the number of occurrences of P in S. Finally, we are able to generalize our results to navigation and other operations on grammar-compressed trees. All of the above bounds significantly improve the currently best known results. To achieve these bounds, we introduce several new techniques and data structures of independent interest, including a predecessor data structure, two ”biased” weighted ancestor data structures, and a compact representation of heavy-paths in grammars.
Lower Bounds for Dynamic Algebraic Problems
, 1999
"... this paper appeared in Proc. 16th Annual Symposium on Theoretical Aspects of Computer Science. Lecture Notes in Computer Science 1563 (1999) pp. 362--372. ..."
Abstract
-
Cited by 5 (1 self)
- Add to MetaCart
this paper appeared in Proc. 16th Annual Symposium on Theoretical Aspects of Computer Science. Lecture Notes in Computer Science 1563 (1999) pp. 362--372.
Two new methods for transforming priority queues into double-ended priority queues
- CPH STL Report
, 2006
"... Abstract. Two new ways of transforming a priority queue into a double-ended priority queue are introduced. These methods can be used to improve all known bounds for the comparison complexity of double-ended priority-queue operations. Using an efficient priority queue, the first transformation can pr ..."
Abstract
-
Cited by 5 (5 self)
- Add to MetaCart
Abstract. Two new ways of transforming a priority queue into a double-ended priority queue are introduced. These methods can be used to improve all known bounds for the comparison complexity of double-ended priority-queue operations. Using an efficient priority queue, the first transformation can produce a doubleended priority queue which guarantees the worst-case cost of O(1) for find-min, find-max, and insert; and the worst-case cost of O(lg n) including at most lg n + O(1) element comparisons for delete, but the data structure cannot support meld efficiently. Using a meldable priority queue that supports decrease efficiently, the second transformation can produce a meldable double-ended priority queue which guarantees the worst-case cost of O(1) for find-min, find-max, and insert; the worst-case cost of O(lg n) including at most lg n + O(lg lg n) element comparisons for delete; and the worst-case cost of O(min {lg m, lg n}) for meld. Here, m and n denote the number of elements stored in the data structures prior to the operation in question, and lg n is a shorthand for log 2 (max {2, n}). 1.

