Results 1 - 10
of
7,078
Counting Distinct Elements in a Data Stream
, 2002
"... We present three algorithms to count the number of distinct elements in a data stream to within a factor of 1 ± epsilon. Our algorithms improve upon known algorithms for this problem, and offer a spectrum of time/space tradeoffs. ..."
Cited by 191 (4 self)
An Optimal Algorithm for the Distinct Elements Problem
"... We give the first optimal algorithm for estimating the number of distinct elements in a data stream, closing a long line of theoretical research on this problem begun by Flajolet and Martin in their seminal paper in FOCS 1983. This problem has applications to query optimization, Internet routing, ne ..."
Cited by 67 (7 self)
Streaming Algorithms for Robust Distinct Elements
"... We study the problem of estimating distinct elements in the data stream model, which has a central role in traffic monitoring, query optimization, data mining and data integration. Different from all previous work, we study the problem in the noisy data setting, where two different looking items i ..."
Tight lower bounds for the distinct elements problem
- In FOCS
, 2003
"... We prove strong lower bounds for the space complexity of (ε, δ)-approximating the number of distinct elements F_0 in a data stream. Let m be the size of the universe from which the stream elements are drawn. We show that any one-pass streaming algorithm for (ε, δ)-approximating F_0 must us ..."
Cited by 59 (10 self)
Investor Protection and Corporate Governance
, 1999
"... Recent research on corporate governance has documented large differences between countries in ownership concentration in publicly traded firms, in the breadth and depth of financial markets, and in the access of firms to external finance. We suggest that there is a common element to the explanations ..."
Cited by 590 (11 self)
A classification and comparison framework for software architecture description languages
- IEEE Transactions on Software Engineering
, 2000
"... Software architectures shift the focus of developers from lines-of-code to coarser-grained architectural elements and their overall interconnection structure. Architecture description languages (ADLs) have been proposed as modeling notations to support architecture-based development. There is, howev ..."
Cited by 855 (59 self)
Probabilistic Counting Algorithms for Data Base Applications
, 1985
"... This paper introduces a class of probabilistic counting algorithms with which one can estimate the number of distinct elements in a large collection of data (typically a large file stored on disk) in a single pass using only a small additional storage (typically less than a hundred binary words) a ..."
Cited by 444 (6 self)
APPARC PCA5 - Parallelizing Distinct Element Simulations
"... The parallelization of distinct element models is discussed with particular attention being paid to the problem of load balancing. One method for solving the load balancing problem through the use of a local, dynamic procedure is given and an implementation on the Cray-T3D and the Intel Paragon is d ..."
Data Streams as Random Permutations: the Distinct Element Problem
"... In this paper, we show that data streams can sometimes usefully be studied as random permutations. This simple observation allows a wealth of classical and recent results from combinatorics to be recycled, with minimal effort, as estimators for various statistics over data streams. We illustrate this by introducing RECORDINALITY, an algorithm which estimates the number of distinct elements in a stream by counting the number of k-records occurring in it. The algorithm has a score of interesting properties, such as providing a random sample of the set underlying the stream. To the best of our knowledge, a ..."
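The k-record counting idea named in this abstract can be sketched roughly as follows. This is an illustrative reading, not the authors' code: the function names, the use of BLAKE2 as the hash, and the default k are assumptions, and the estimator k(1 + 1/k)^(R - k + 1) - 1 is quoted from memory of the paper rather than from the snippet above, so it should be checked against the original before being relied on.

```python
import hashlib
import heapq


def _hash64(item):
    """Map an item to a 64-bit integer (hypothetical helper; any good hash works)."""
    digest = hashlib.blake2b(repr(item).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def recordinality_estimate(stream, k=64):
    """Estimate the number of distinct items by counting k-records of hash values."""
    heap = []        # min-heap holding the k largest hash values seen so far
    members = set()  # the same values, kept for O(1) duplicate checks
    records = 0      # R: how many times the set of k largest values changed

    for item in stream:
        h = _hash64(item)
        if h in members:
            continue  # repeated item currently in the sample: not a new record
        if len(heap) < k:
            heapq.heappush(heap, h)
            members.add(h)
            records += 1
        elif h > heap[0]:
            evicted = heapq.heapreplace(heap, h)  # drop the current minimum
            members.discard(evicted)
            members.add(h)
            records += 1

    if records < k:
        # Fewer than k distinct items were seen, so the count is exact.
        return float(records)
    # Assumed estimator (recalled from the paper, not stated in the abstract).
    return k * (1 + 1 / k) ** (records - k + 1) - 1
```

Because an evicted hash value is always below the current minimum, a repeated item can never be counted as a record twice, so the result depends only on the set of distinct items and the order of their first appearance. The items whose hashes remain in `heap` form the random sample of the underlying set that the abstract mentions.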
The Average-Case Complexity of Counting Distinct Elements
"... We continue the study of approximating the number of distinct elements in a data stream of length n to within a (1±ɛ) factor. It is known that if the stream may consist of arbitrary data arriving in an arbitrary order, then any 1-pass algorithm requires Ω(1/ɛ²) bits of space to perform this task. T ..."
Cited by 6 (1 self)