Results 1  10
of
15
An Optimal Algorithm for the Distinct Elements Problem
"... We give the first optimal algorithm for estimating the number of distinct elements in a data stream, closing a long line of theoretical research on this problem begun by Flajolet and Martin in their seminal paper in FOCS 1983. This problem has applications to query optimization, Internet routing, ne ..."
Abstract

Cited by 70 (6 self)
 Add to MetaCart
(Show Context)
We give the first optimal algorithm for estimating the number of distinct elements in a data stream, closing a long line of theoretical research on this problem begun by Flajolet and Martin in their seminal paper in FOCS 1983. This problem has applications to query optimization, Internet routing, network topology, and data mining. For a stream of indices in {1,..., n}, our algorithm computes a (1 ± ε)approximation using an optimal O(ε −2 +log(n)) bits of space with 2/3 success probability, where 0 < ε < 1 is given. This probability can be amplified by independent repetition. Furthermore, our algorithm processes each stream update in O(1) worstcase time, and can report an estimate at any point midstream in O(1) worstcase time, thus settling both the space and time complexities simultaneously.
Resizable Arrays in Optimal Time and Space
, 1999
"... . We present simple, practical and efficient data structures for the fundamental problem of maintaining a resizable onedimensional array, A[l::l + n \Gamma 1], of fixedsize elements, as elements are added to or removed from one or both ends. Our structures also support access to the element in ..."
Abstract

Cited by 22 (0 self)
 Add to MetaCart
. We present simple, practical and efficient data structures for the fundamental problem of maintaining a resizable onedimensional array, A[l::l + n \Gamma 1], of fixedsize elements, as elements are added to or removed from one or both ends. Our structures also support access to the element in position i. All operations are performed in constant time. The extra space (i.e., the space used past storing the n current elements) is O( p n) at any point in time. This is shown to be within a constant factor of optimal, even if there are no constraints on the time. If desired, each memory block can be made to have size 2 k \Gamma c for a specified constant c, and hence the scheme works effectively with the buddy system. The data structures can be used to solve a variety of problems with optimal bounds on time and extra storage. These include stacks, queues, randomized queues, priority queues, and deques. 1 Introduction The initial motivation for this research was a fundame...
TransDichotomous Algorithms without Multiplication  some Upper and Lower Bounds
, 1997
"... We show that on a RAM with addition, subtraction, bitwise Boolean operations and shifts, but no multiplication, there is a transdichotomous solution to the static dictionary problem using linear space and with query time . On the way, we show that two wbit words can be multiplied in time ..."
Abstract

Cited by 16 (1 self)
 Add to MetaCart
We show that on a RAM with addition, subtraction, bitwise Boolean operations and shifts, but no multiplication, there is a transdichotomous solution to the static dictionary problem using linear space and with query time . On the way, we show that two wbit words can be multiplied in time (log w) and that time \Omega\Gammame/ w) is necessary, and that \Theta(log log w) time is necessary and sufficient for identifying the least significant set bit of a word.
Worst case constant time priority queue
 In Proc. 12th ACMSIAM Symposium on Discrete Algorithms
, 2001
"... We present a new data structure of size O(M) for solving the vEB problem. When this data structure is used in combination with a new memory topology it provides an O(1) worst case time solution. 1 ..."
Abstract

Cited by 11 (4 self)
 Add to MetaCart
We present a new data structure of size O(M) for solving the vEB problem. When this data structure is used in combination with a new memory topology it provides an O(1) worst case time solution. 1
Fast Functional Lists, HashLists, Deques and Variable Length Arrays
 In Implementation of Functional Languages, 14th International Workshop
, 2002
"... This paper introduces a new data structure, the VList, that is compact, thread safe and significantly faster to use than Linked Lists for nearly all list operations. Space usage can be reduced by 50% to 90% and in typical list operations speed improved by factors ranging from 4 to 20 or more. Some ..."
Abstract

Cited by 9 (0 self)
 Add to MetaCart
This paper introduces a new data structure, the VList, that is compact, thread safe and significantly faster to use than Linked Lists for nearly all list operations. Space usage can be reduced by 50% to 90% and in typical list operations speed improved by factors ranging from 4 to 20 or more. Some important operations such as indexing and length are typically changed from O(N) to O(1) and O(lgN) respectively. A language interpreter Visp, using a dialect of Common Lisp, has been implemented using VLists and the benchmark comparison with OCAML reported. It is also shown how to adapt the structure to create variable length arrays, persistent deques and functional hash tables. The VArray requires no resize copying and has an average O(1) random access time. Comparisons are made with previous resizable one dimensional arrays, Hash Array Trees (HAT) Sitarski [1996], and Brodnik, Carlsson, Demaine, Munro, and Sedgewick [1999]
Fast Allocation and Deallocation with an Improved Buddy System
 IN PROCEEDINGS OF THE 19TH CONFERENCE ON THE FOUNDATIONS OF SOFTWARE TECHNOLOGY AND THEORETICAL COMPUTER SCIENCE (FST & TCS'99), LECTURE NOTES IN COMPUTER SCIENCE
, 1999
"... We propose several modifications to the binary buddy system for managing dynamic allocation of memory blocks whose sizes are powers of two. The standard buddy system allocates and deallocates blocks in \Theta(lg n) time in the worst case (and on an amortized basis), where n is the size of the me ..."
Abstract

Cited by 6 (2 self)
 Add to MetaCart
(Show Context)
We propose several modifications to the binary buddy system for managing dynamic allocation of memory blocks whose sizes are powers of two. The standard buddy system allocates and deallocates blocks in \Theta(lg n) time in the worst case (and on an amortized basis), where n is the size of the memory. We present two schemes that improve the running time to O(1) time, where the time bound for deallocation is amortized. The first scheme uses one word of extra storage compared to the standard buddy system, but may fragment memory more than necessary. The second scheme has essentially the same fragmentation as the standard buddy system, and uses O(2 (1+ p lg n) lg lg n ) bits of auxiliary storage, which is !(lg k n) but o(n " ) for all k 1 and " ? 0. Finally, we present simulation results estimating the effect of the excess fragmentation in the first scheme.
Fast Allocation and Deallocation with an Improved Buddy System
, 2003
"... We propose several modifications to the binary buddy system for managing dynamic allocation of memory blocks whose sizes are powers of two. The standard buddy system allocates and deallocates blocks in Θ(lg n) time in the worst case (and on an amortized basis), where n is the size of the memory. We ..."
Abstract

Cited by 4 (0 self)
 Add to MetaCart
(Show Context)
We propose several modifications to the binary buddy system for managing dynamic allocation of memory blocks whose sizes are powers of two. The standard buddy system allocates and deallocates blocks in Θ(lg n) time in the worst case (and on an amortized basis), where n is the size of the memory. We present three schemes that improve the running time to O(1) time, where the time bound for deallocation is amortized for the first two schemes. The first scheme uses just one more word of memory than the standard buddy system, but may result in greater fragmentation than necessary. The second and third schemes have essentially the same fragmentation as the standard buddy system, and use O(2 (1+ √ lg n) lg lg n) bits of auxiliary storage, which is ω(lg k n) but o(n ε) for all k ≥ 1 and ε> 0. Finally, we present simulation results estimating the effect of the excess fragmentation in the first scheme.
Sketching and Streaming HighDimensional Vectors
, 2011
"... A sketch of a dataset is a smallspace data structure supporting some prespecified set of queries (and possibly updates) while consuming space substantially sublinear in the space required to actually store all the data. Furthermore, it is often desirable, or required by the application, that the sk ..."
Abstract

Cited by 3 (0 self)
 Add to MetaCart
(Show Context)
A sketch of a dataset is a smallspace data structure supporting some prespecified set of queries (and possibly updates) while consuming space substantially sublinear in the space required to actually store all the data. Furthermore, it is often desirable, or required by the application, that the sketch itself be computable by a smallspace algorithm given just one pass over the data, a socalled streaming algorithm. Sketching and streaming have found numerous applications in network traffic monitoring, data mining, trend detection, sensor networks, and databases. In this thesis, I describe several new contributions in the area of sketching and streaming algorithms. • The first spaceoptimal streaming algorithm for the distinct elements problem. Our algorithm also achieves O(1) update and reporting times. • A streaming algorithm for Hamming norm estimation in the turnstile model which achieves the best known space complexity.