Results 1 - 10 of 7,078

Counting Distinct Elements in a Data Stream

by Ziv Bar-Yossef, T. S. Jayram, Ravi Kumar, D. Sivakumar, Luca Trevisan, 2002
We present three algorithms to count the number of distinct elements in a data stream to within a factor of 1 ± epsilon. Our algorithms improve upon known algorithms for this problem, and offer a spectrum of time/space tradeoffs.
Cited by 191 (4 self)

An Optimal Algorithm for the Distinct Elements Problem

by Daniel M. Kane, Jelani Nelson, David P. Woodruff
We give the first optimal algorithm for estimating the number of distinct elements in a data stream, closing a long line of theoretical research on this problem begun by Flajolet and Martin in their seminal paper in FOCS 1983. This problem has applications to query optimization, Internet routing, ...
Cited by 67 (7 self)

Streaming Algorithms for Robust Distinct Elements

by Di Chen, Qin Zhang
We study the problem of estimating distinct elements in the data stream model, which has a central role in traffic monitoring, query optimization, data mining and data integration. Different from all previous work, we study the problem in the noisy data setting, where two different looking items ...

Tight lower bounds for the distinct elements problem

by Piotr Indyk - In FOCS, 2003
We prove strong lower bounds for the space complexity of (ε, δ)-approximating the number of distinct elements F_0 in a data stream. Let m be the size of the universe from which the stream elements are drawn. We show that any one-pass streaming algorithm for (ε, δ)-approximating F_0 must ...
Cited by 59 (10 self)

Investor Protection and Corporate Governance

by Rafael La Porta, Florencio Lopez-de-Silanes, Andrei Shleifer, Robert Vishny, 1999
Recent research on corporate governance has documented large differences between countries in ownership concentration in publicly traded firms, in the breadth and depth of financial markets, and in the access of firms to external finance. We suggest that there is a common element to the explanations ...
Cited by 590 (11 self)

A classification and comparison framework for software architecture description languages

by Nenad Medvidovic, Richard N. Taylor - IEEE Transactions on Software Engineering, 2000
Software architectures shift the focus of developers from lines-of-code to coarser-grained architectural elements and their overall interconnection structure. Architecture description languages (ADLs) have been proposed as modeling notations to support architecture-based development. There is ...
Cited by 855 (59 self)

Probabilistic Counting Algorithms for Data Base Applications

by Philippe Flajolet, G. Nigel Martin, 1985
This paper introduces a class of probabilistic counting algorithms with which one can estimate the number of distinct elements in a large collection of data (typically a large file stored on disk) in a single pass using only a small additional storage (typically less than a hundred binary words) ...
Cited by 444 (6 self)
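
As a rough illustration of the single-pass, small-memory idea this abstract describes, the Python sketch below implements a basic single-bitmap variant of probabilistic counting. The hash function (SHA-256 truncated to 64 bits), the helper names `_hash64`, `_rho`, and `fm_estimate`, and the use of one bitmap instead of an averaged multi-bitmap scheme are all assumptions made for illustration; none of them come from the listing. The constant 0.77351 is the correction factor commonly quoted for this estimator.

```python
import hashlib

PHI = 0.77351  # correction factor commonly quoted for this estimator


def _hash64(item: str) -> int:
    """Hypothetical helper: map an item to a 64-bit integer via SHA-256."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def _rho(x: int, width: int = 64) -> int:
    """0-based index of the least-significant set bit (width if x is 0)."""
    if x == 0:
        return width
    return (x & -x).bit_length() - 1


def fm_estimate(stream) -> float:
    """Estimate the number of distinct items in one pass with one bitmap."""
    bitmap = 0
    for item in stream:
        # Record which "first set bit" positions have been observed.
        bitmap |= 1 << _rho(_hash64(item))
    # R is the index of the lowest bit that was never set.
    r = 0
    while bitmap & (1 << r):
        r += 1
    return (2 ** r) / PHI
```

The bitmap only ever gains bits, so repeated items and reordering leave the estimate unchanged. A single bitmap gives a coarse estimate; averaging over many bitmaps is the usual way to tighten it.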

APPARC PCA5 - Parallelizing Distinct Element Simulations

by R. Knecht, G.A. Kohring
The parallelization of distinct element models is discussed with particular attention being paid to the problem of load balancing. One method for solving the load balancing problem through the use of a local, dynamic procedure is given and an implementation on the Cray-T3D and the Intel Paragon is ...

Data Streams as Random Permutations: the Distinct Element Problem

by Ahmed Helmi, Jérémie Lumbroso, Conrado Martínez, Alfredo Viola
In this paper, we show that data streams can sometimes usefully be studied as random permutations. This simple observation allows a wealth of classical and recent results from combinatorics to be recycled, with minimal effort, as estimators for various statistics over data streams. We illustrate this by introducing RECORDINALITY, an algorithm which estimates the number of distinct elements in a stream by counting the number of k-records occurring in it. The algorithm has a score of interesting properties, such as providing a random sample of the set underlying the stream. To the best of our knowledge, a ...
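
The k-record idea mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that hashing makes the stream behave like a random permutation, keeps the k largest hash values seen so far, and counts how often that set changes. The function names, the SHA-256 hash, and the default `k` are hypothetical choices, and the closing formula `k * (1 + 1/k) ** (R - k + 1) - 1` is the estimator as I recall it from the RECORDINALITY paper, so treat it as an assumption to verify against the original.

```python
import hashlib
import heapq


def _hash64(item: str) -> int:
    """Hypothetical helper: map an item to a 64-bit integer via SHA-256."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def recordinality(stream, k: int = 64) -> float:
    """Estimate the distinct count from the number of k-records."""
    heap = []        # min-heap holding the k largest hash values seen
    members = set()  # the same values, for O(1) membership checks
    records = 0      # R: how many times the k-largest set has changed
    for item in stream:
        h = _hash64(item)
        if h in members:
            continue  # repeat of an item currently in the sample
        if len(heap) < k:
            heapq.heappush(heap, h)
            members.add(h)
            records += 1
        elif h > heap[0]:
            evicted = heapq.heapreplace(heap, h)
            members.discard(evicted)
            members.add(h)
            records += 1
    if records < k:
        # Fewer than k distinct items were seen, so the count is exact.
        return float(records)
    return k * (1 + 1 / k) ** (records - k + 1) - 1
```

The items whose hashes remain in `members` at the end form a sample of the distinct elements, which matches the sampling property the abstract mentions. Unlike bitmap-based sketches, the record count depends on arrival order, which is why the random-permutation assumption matters.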

The Average-Case Complexity of Counting Distinct Elements

by David P. Woodruff
We continue the study of approximating the number of distinct elements in a data stream of length n to within a (1±ɛ) factor. It is known that if the stream may consist of arbitrary data arriving in an arbitrary order, then any 1-pass algorithm requires Ω(1/ɛ²) bits of space to perform this task. ...
Cited by 6 (1 self)